update document

liunaijie · Jun 24, 2024 · 01cb36d
1 parent e3addf7

Showing 26 changed files with 105 additions and 33 deletions.
diff --git a/README.md b/README.md
@@ -61,7 +61,7 @@ SeaTunnel addresses common data integration challenges:
 
 ## SeaTunnel Workflow
 
-![SeaTunnel Workflow](docs/en/images/architecture_diagram.png)
+![SeaTunnel Workflow](docs/images/architecture_diagram.png)
 
 Configure jobs, select execution engines, and parallelize data using Source Connectors. Easily develop and extend connectors to meet your needs.
 

diff --git a/docs/en/about.md b/docs/en/about.md
@@ -34,7 +34,7 @@ SeaTunnel focuses on data integration and data synchronization, and is mainly de
 
 ## SeaTunnel work flowchart
 
-![SeaTunnel work flowchart](images/architecture_diagram.png)
+![SeaTunnel work flowchart](../images/architecture_diagram.png)
 
 The runtime process of SeaTunnel is shown in the figure above.
 

diff --git a/docs/en/faq.md b/docs/en/faq.md
@@ -65,9 +65,9 @@ Refer to: [lightbend/config#456](https://github.com/lightbend/config/issues/456)
 
 Of course! See the screenshot below:
 
-![workflow.png](images/workflow.png)
+![workflow.png](../images/workflow.png)
 
-![azkaban.png](images/azkaban.png)
+![azkaban.png](../images/azkaban.png)
 
 ## Does SeaTunnel have a case for configuring multiple sources, such as configuring elasticsearch and hdfs in source at the same time?
 
@@ -184,7 +184,7 @@ The following conclusions can be drawn:
 
 3. In general, both M and N are determined, and the conclusion can be drawn from 2: The size of `spark.streaming.kafka.maxRatePerPartition` is positively correlated with the size of `spark.executor.cores` * `spark.executor.instances`, and it can be increased while increasing the resource `maxRatePerPartition` to speed up consumption.
 
-![kafka](images/kafka.png)
+![kafka](../images/kafka.png)
 
 ## How can I solve the Error `Exception in thread "main" java.lang.NoSuchFieldError: INSTANCE`?
 

diff --git a/docs/en/seatunnel-engine/resource-manager.md b/docs/en/seatunnel-engine/resource-manager.md
@@ -3,8 +3,11 @@
 sidebar_position: 9
 -------------------
 
-In version 2.3.6. SeaTunnel can add tag to worker node, when you submit job you can specify the tag you want to run.  
-update the config in `hazelcast.yaml`,
+Starting with version 2.3.6, SeaTunnel lets you add a `tag` to each worker node. When you submit a job, you can use `tag_filter` to select the nodes the job should run on.
+
+# How to achieve this
+
+1. Update the config in `hazelcast.yaml`:
 
 ```yaml
 hazelcast:
@@ -40,13 +43,14 @@ hazelcast:
 ```
 
 In this config, we specify the tag by `member-attributes`, the node has `group=platform, team=team1` tags.
-Then, when we use this job config to submit job, we can assign the task to this node.
+
+2. Add `tag_filter` to your job config:
 
 ```hacon
 env {
   parallelism = 1
   job.mode = "BATCH"
-  tag {
+  tag_filter {
     group = "platform"
     team = "team1"
   }
@@ -72,10 +76,8 @@ sink {
 ```
 
 **Notice:**
-- If not set this tag in config, it will choose the node in all active nodes.
-- In you input a not exist tag, like `group=platform, team=team2`, you will get `NoEnoughResourceException` exception.
-- if you special multiple tag, it needs all tag exist and value match,  you can add multiple tags to node, but only use few tag to filter node.
-like only use `group=platform`
+- If `tag_filter` is not set in the job config, the job runs on nodes chosen at random from all active nodes.
+- When `tag_filter` contains multiple tags, a node matches only if every key exists on it with an equal value. If no node matches, a `NoEnoughResourceException` is thrown.
 
-![img.png](resource_tag.png)
+![img.png](../../images/resource_tag.png)
 
diff --git a/docs/en/seatunnel-engine/resource_tag.png b/docs/en/seatunnel-engine/resource_tag.png
diff --git a/docs/en/images/architecture_diagram.png b/docs/images/architecture_diagram.png
diff --git a/docs/en/images/azkaban.png b/docs/images/azkaban.png
diff --git a/docs/en/images/checkstyle.png b/docs/images/checkstyle.png
diff --git a/docs/en/images/kafka.png b/docs/images/kafka.png
diff --git a/docs/images/resource_tag.png b/docs/images/resource_tag.png
diff --git a/docs/en/images/seatunnel-workflow.svg b/docs/images/seatunnel-workflow.svg
diff --git a/docs/en/images/seatunnel_architecture.png b/docs/images/seatunnel_architecture.png
diff --git a/docs/en/images/seatunnel_starter.png b/docs/images/seatunnel_starter.png
diff --git a/docs/en/images/workflow.png b/docs/images/workflow.png
diff --git a/docs/sidebars.js b/docs/sidebars.js
@@ -178,7 +178,8 @@ const sidebars = {
                 "seatunnel-engine/checkpoint-storage",
                 "seatunnel-engine/rest-api",
                 "seatunnel-engine/tcp",
-                "seatunnel-engine/engine-jar-storage-mode"
+                "seatunnel-engine/engine-jar-storage-mode",
+                "seatunnel-engine/resource-manager",
             ]
         },
         {

diff --git a/docs/zh/images/architecture_diagram.png b/docs/zh/images/architecture_diagram.png
diff --git a/docs/zh/images/azkaban.png b/docs/zh/images/azkaban.png
diff --git a/docs/zh/images/checkstyle.png b/docs/zh/images/checkstyle.png
diff --git a/docs/zh/images/kafka.png b/docs/zh/images/kafka.png
diff --git a/docs/zh/images/seatunnel-workflow.svg b/docs/zh/images/seatunnel-workflow.svg
diff --git a/docs/zh/images/seatunnel_architecture.png b/docs/zh/images/seatunnel_architecture.png
diff --git a/docs/zh/images/seatunnel_starter.png b/docs/zh/images/seatunnel_starter.png
diff --git a/docs/zh/images/workflow.png b/docs/zh/images/workflow.png
diff --git a/docs/zh/seatunnel-engine/resource-manager.md b/docs/zh/seatunnel-engine/resource-manager.md
@@ -0,0 +1,83 @@
+---
+
+sidebar_position: 9
+-------------------
+
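+Starting with version 2.3.6, SeaTunnel supports adding a `tag` to each instance. When submitting a job, you can then use `tag_filter` in the config file to select the nodes the job will run on.
+
+# How to achieve this
+
+1. Update the `hazelcast.yaml` file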
+
+```yaml
+hazelcast:
+  cluster-name: seatunnel
+  network:
+    rest-api:
+      enabled: true
+      endpoint-groups:
+        CLUSTER_WRITE:
+          enabled: true
+        DATA:
+          enabled: true
+    join:
+      tcp-ip:
+        enabled: true
+        member-list:
+          - localhost
+    port:
+      auto-increment: false
+      port: 5801
+  properties:
+    hazelcast.invocation.max.retry.count: 20
+    hazelcast.tcp.join.port.try.count: 30
+    hazelcast.logging.type: log4j2
+    hazelcast.operation.generic.thread.count: 50
+  member-attributes:
+    group:
+      type: string
+      value: platform
+    team:
+      type: string
+      value: team1
+```
+
+In this config, we set two `tag` entries, `group=platform` and `team=team1`, through `member-attributes`.
+
+2. Add `tag_filter` to the job config to select the nodes that should run the job
+
+```hacon
+env {
+  parallelism = 1
+  job.mode = "BATCH"
+  tag_filter {
+    group = "platform"
+    team = "team1"
+  }
+}
+source {
+  FakeSource {
+    result_table_name = "fake"
+    parallelism = 1
+    schema = {
+      fields {
+        name = "string"
+      }
+    }
+  }
+}
+transform {
+}
+sink {
+  console {
+    source_table_name="fake"
+  }
+}
+```
+
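+**Notice:**
+- When `tag_filter` is not added to the job config, the nodes that run the job are chosen at random from all nodes.
+- When `tag_filter` contains multiple filter conditions, a node matches only if every key exists on it and every value is equal. When no matching node is found, a `NoEnoughResourceException` is thrown.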
+
+![img.png](../../images/resource_tag.png)
+
diff --git a/seatunnel-engine/seatunnel-engine-common/src/main/resources/hazelcast.yaml b/seatunnel-engine/seatunnel-engine-common/src/main/resources/hazelcast.yaml
@@ -38,11 +38,4 @@ hazelcast:
     hazelcast.invocation.max.retry.count: 20
     hazelcast.tcp.join.port.try.count: 30
     hazelcast.logging.type: log4j2
-    hazelcast.operation.generic.thread.count: 50
-  member-attributes:
-    group:
-      type: string
-      value: platform
-    team:
-      type: string
-      value: team1
+    hazelcast.operation.generic.thread.count: 50
diff --git a/...main/java/org/apache/seatunnel/engine/server/resourcemanager/AbstractResourceManager.java b/...main/java/org/apache/seatunnel/engine/server/resourcemanager/AbstractResourceManager.java
@@ -150,14 +150,11 @@ public CompletableFuture<List<SlotProfile>> applyResources(
                                         boolean match = true;
                                         for (Map.Entry<String, String> entry :
                                                 tagFilter.entrySet()) {
-                                            if (workerAttr.containsKey(entry.getKey())
-                                                    && workerAttr
+                                            if (!workerAttr.containsKey(entry.getKey())
+                                                    || !workerAttr
                                                             .get(entry.getKey())
                                                             .equals(entry.getValue())) {
-                                                // need all tag match
-                                            } else {
-                                                match = false;
-                                                break;
+                                                return false;
                                             }
                                         }
                                         return match;
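
The matching rule that this hunk simplifies (every `tag_filter` key must exist on the worker with an equal value) can be sketched in isolation as follows. This is a minimal illustration, not the code in `AbstractResourceManager`: the class `TagFilterSketch` and its methods `matches` and `selectWorkers` are hypothetical names, the worker names are made up, and treating an empty filter as matching every worker is an assumption based on the documented behavior when `tag_filter` is not set.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical, standalone sketch of the tag-matching rule; not SeaTunnel's actual class.
final class TagFilterSketch {

    /** Returns true only if every filter key exists on the worker with an equal value. */
    static boolean matches(Map<String, String> workerAttr, Map<String, String> tagFilter) {
        for (Map.Entry<String, String> entry : tagFilter.entrySet()) {
            String actual = workerAttr.get(entry.getKey());
            // A missing key or a different value disqualifies the worker immediately.
            if (actual == null || !actual.equals(entry.getValue())) {
                return false;
            }
        }
        // Assumption: an empty filter places no restriction, so every worker matches.
        return true;
    }

    /** Returns the names of all workers whose attributes satisfy the filter, sorted by name. */
    static List<String> selectWorkers(
            Map<String, Map<String, String>> workers, Map<String, String> tagFilter) {
        return workers.entrySet().stream()
                .filter(e -> matches(e.getValue(), tagFilter))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> workers = Map.of(
                "node1", Map.of("group", "platform", "team", "team1"),
                "node2", Map.of("group", "platform", "team", "team2"));

        // Filtering on a subset of the tags is allowed: both nodes carry group=platform.
        System.out.println(selectWorkers(workers, Map.of("group", "platform")));
        // Adding team=team1 narrows the result to node1 only.
        System.out.println(selectWorkers(workers, Map.of("group", "platform", "team", "team1")));
        // No worker carries team=team3, so the result is empty; the docs state that
        // SeaTunnel reports this case with NoEnoughResourceException.
        System.out.println(selectWorkers(workers, Map.of("team", "team3")));
    }
}
```

Returning `false` on the first mismatch mirrors the early `return false` in the hunk. Within the lines shown, `match` is no longer reassigned after this change, so the trailing `return match;` is only reached when every entry has matched.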