Rule Node Scripts: delight-nashorn-sandbox ScriptCPUAbuseException #11077

Closed
rpcai opened this issue Jun 24, 2024 · 10 comments · Fixed by #11318
Labels: bug, confirmed

Comments

@rpcai

rpcai commented Jun 24, 2024

Describe the bug
Since 3.7.0, lengthy rule-node scripts that worked in 3.6.4 now fail with:
Can't compile script: delight.nashornsandbox.exceptions.ScriptCPUAbuseException: Regular expression running for too many iterations. The operation could NOT be gracefully interrupted.

This appears to be directly related to commit cd722a1, which changes the delight-nashorn-sandbox version from 0.2.1 to 0.4.2.

delight-nashorn-sandbox has an open bug, introduced in 0.3.1, which appears to fail to parse certain structures: javadelight/delight-nashorn-sandbox#151

Your Server Environment

  • Deployment: monolith
  • Deployment type: deb, rpm, exe, docker-compose, k8s, ami
  • ThingsBoard Version: 3.7.0
  • Professional Edition
  • Ubuntu

Your Client Environment

  • OS: Windows 11
  • Browser: Edge

To Reproduce
Steps to reproduce the behavior:

  1. Create a new rule chain transformation script node, Javascript, with the following code:
var variable = {
0:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
1:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
2:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
3:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
4:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
5:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
6:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
7:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
8:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
9:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
10:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
11:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
12:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
13:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
14:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
15:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
16:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
17:{ status: 'ABCDEF', statusName: 'ABCDEF' },
18:{ status: 'ABCDEF', statusName: 'ABCDEF' },
19:{ status: 'ABCDEF', statusName: 'ABCDE' }
};
//This is a Comment
return msg;
  2. Test the function. Observe that it runs OK.
  3. Add at least one non-whitespace character to the script.
  4. Test the function. Observe that it fails.
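The steps above suggest that the failure depends on the total length of the script rather than on any particular statement. A minimal sketch for narrowing that down, assuming you paste the generated text into the script node by hand: the helper below (`buildReproScript`, with its `entryCount` and `commentLength` parameters, is a hypothetical name introduced here, not part of ThingsBoard) produces script bodies of the same shape as the one above, so the length can be varied one character at a time.

```javascript
// Hypothetical helper: builds a rule-node script body shaped like the repro above.
// entryCount and commentLength are illustrative knobs, not ThingsBoard settings.
function buildReproScript(entryCount, commentLength) {
  var lines = ['var variable = {'];
  for (var i = 0; i < entryCount; i++) {
    // Every entry except the last one ends with a comma.
    var separator = i < entryCount - 1 ? ',' : '';
    lines.push(i + ":{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' }" + separator);
  }
  lines.push('};');
  // Pad the trailing comment so the total length grows one character at a time.
  lines.push('//' + new Array(commentLength + 1).join('x'));
  lines.push('return msg;');
  return lines.join('\n');
}

// Print one candidate body and its length; paste it into the script node to test it.
var script = buildReproScript(20, 17);
console.log(script.length);
console.log(script);
```

Increasing `commentLength` by one between tests should show whether there is a length threshold at which compilation starts to fail. The exact threshold, if any, is not stated in this thread.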

Additional context
As a workaround, setting the environment variable USE_LOCAL_JS_SANDBOX = false in thingsboard.conf appears to mitigate the error. However, we don't want to do this in production.

@pon0marev
Contributor

Please attach sample code that causes the error in the script node of the rule engine.
However, if it is a bug in the nashorn library, the ThingsBoard development team will not be able to fix it on their end. The solution is to wait for a new nashorn version with a fix, or to install the remote JS executors. If you are not familiar with installing JS executors, I can provide installation instructions.

@rpcai
Author

rpcai commented Jun 27, 2024

I have replicated the issue in another environment:

Debian 11 / Docker, docker-compose file below:

version: '3.7'
services:
  mytb:
    restart: always
    image: "thingsboard/tb-postgres"
    ports:
      - "8080:9090"
      - "1883:1883"
      - "7070:7070"
      - "5683-5688:5683-5688/udp"
    environment:
      TB_QUEUE_TYPE: in-memory
    volumes:
      - tb_data:/data
      - tb_logs:/var/log/thingsboard  
      - tb_conf:/usr/share/thingsboard/conf

#defined portainer volumes
volumes:
  tb_data:
  tb_logs:
  tb_conf:

Rule Node Code that works:

var variable = {
0:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
1:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
2:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
3:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
4:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
5:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
6:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
7:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
8:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
9:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
10:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
11:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
12:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
13:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
14:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
15:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
16:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
17:{ status: 'ABCDEF', statusName: 'ABCDEF' },
18:{ status: 'ABCDEF', statusName: 'ABCDEF' }
};
var foo = Object.assign({},variable);
//This is a Comment
return msg;

Rule node code that fails:

var variable = {
0:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
1:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
2:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
3:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
4:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
5:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
6:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
7:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
8:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
9:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
10:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
11:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
12:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
13:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
14:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
15:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
16:{ status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
17:{ status: 'ABCDEF', statusName: 'ABCDEF' },
18:{ status: 'ABCDEF', statusName: 'ABCDEF' }
};
var foo = Object.assign({},variable);
//This is a Comment
//This comment is too long
return msg;


Changing the thingsboard.yml variable USE_LOCAL_JS_SANDBOX to false (from the default, true) resolves the issue.
However, I also notice that certain JS syntax now fails (e.g. Object.assign). Is that because only ES5 is supported?
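If Object.assign is indeed unavailable in a given script engine (this thread does not confirm which ES level each engine supports), the shallow copy in the scripts above can be written using ES5 features only. The following is a minimal sketch; `shallowCopy` is a hypothetical helper name, and it only covers own enumerable string-keyed properties, which is enough for the lookup table used here.

```javascript
// ES5-only shallow copy: a sketch of what Object.assign({}, source) does
// for own enumerable string-keyed properties. shallowCopy is a hypothetical name.
function shallowCopy(source) {
  var target = {};
  for (var key in source) {
    // Skip properties inherited through the prototype chain.
    if (Object.prototype.hasOwnProperty.call(source, key)) {
      target[key] = source[key];
    }
  }
  return target;
}

var variable = {
  0: { status: 'ABCDEF12345', statusName: 'ABCDEF12345' },
  1: { status: 'ABCDEF', statusName: 'ABCDEF' }
};

// Drop-in replacement for: var foo = Object.assign({}, variable);
var foo = shallowCopy(variable);
```

As with Object.assign, the copy is shallow: `foo[0]` and `variable[0]` refer to the same nested object.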

Questions:

  1. What are the implications of USE_LOCAL_JS_SANDBOX = false?
  2. Will remote JS executors fix the issue if the root cause is the nashorn library? Will they allow the use of ES6+ syntax?
  3. Can remote JS executors coexist with a monolithic deployment?

@ant2alex

We are experiencing the same issue: most of our MQTT integrations are showing errors because of it. Changing USE_LOCAL_JS_SANDBOX to "false" resolved it (temporarily), but what other issues could that cause?

@pon0marev
Contributor

The sandbox protects your instance from malicious scripts, including infinite loops and JS injections. If you disable it, you must ensure yourself that there are no faulty scripts in your converters and rule chains that could cause processing issues.

JS executors will fix this issue, since they run on Node.js and don't use nashorn. I can't guarantee that JS executors support ES6+, but they support many more functions than nashorn.

JS executors use Kafka to communicate with ThingsBoard. So for a minimal deployment you will need a ThingsBoard monolith, Kafka (with ZooKeeper or in KRaft mode), and the JS executors.

To connect remote JS executors, do the following (example for an installation as an Ubuntu service):
Add the following lines to /usr/share/thingsboard/conf/thingsboard.conf:

export JS_EVALUATOR=remote
export TB_QUEUE_TYPE=kafka
export TB_QUEUE_KAFKA_COMPRESSION=gzip
export TB_KAFKA_SERVERS=localhost:9092

Create docker-compose.yml file:

version: '3.2'
services:
  kafka:
    restart: always
    image: bitnami/kafka:3.5.2
    ports:
      - 9092:9092 #to localhost:9092 from host machine
      - 9093 #for Kraft
      - 9094 #to kafka:9094 from within Docker network
    environment:
      ALLOW_PLAINTEXT_LISTENER: "yes"
      KAFKA_CFG_LISTENERS: "OUTSIDE://:9092,CONTROLLER://:9093,INSIDE://:9094"
      KAFKA_CFG_ADVERTISED_LISTENERS: "OUTSIDE://localhost:9092,INSIDE://kafka:9094"
      KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP: "INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT,CONTROLLER:PLAINTEXT"
      KAFKA_CFG_INTER_BROKER_LISTENER_NAME: "INSIDE"
      KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE: "false"
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: "1"
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: "1"
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: "1"
      KAFKA_CFG_PROCESS_ROLES: "controller,broker" #KRaft
      KAFKA_CFG_NODE_ID: "0" #KRaft
      KAFKA_CFG_CONTROLLER_LISTENER_NAMES: "CONTROLLER" #KRaft
      KAFKA_CFG_CONTROLLER_QUORUM_VOTERS: "0@kafka:9093" #KRaft
    volumes:
      - kafka-data:/bitnami
  tb-js-executor:
    restart: always
    image: "thingsboard/tb-js-executor:3.7.0"
    scale: 5
    environment:
      REMOTE_JS_EVAL_REQUEST_TOPIC: js_eval.requests
      LOGGER_LEVEL: info
      LOG_FOLDER: logs
      LOGGER_FILENAME: tb-js-executor-%DATE%.log
      DOCKER_MODE: "true"
      SCRIPT_BODY_TRACE_FREQUENCY: 1000
      NODE_OPTIONS: "--max-old-space-size=200"
      MAX_ACTIVE_SCRIPTS: "4000"
      TB_QUEUE_TYPE: kafka
      TB_QUEUE_KAFKA_COMPRESSION: "gzip"
      TB_KAFKA_BATCH_SIZE: "128"
      TB_KAFKA_SERVERS: kafka:9094
volumes:
    kafka-data:
      driver: local

Start Docker Compose and restart ThingsBoard to apply the new settings.

@hkecho

hkecho commented Jul 18, 2024

Is there any workaround? I have run into the same issue.

@pon0marev
Contributor

@hkecho see comments above: #11077 (comment) #11077 (comment)

@BiggiePete

BiggiePete commented Jul 26, 2024

This is still an issue with 3.7.0.
#11077 works as of posting

@mxro

mxro commented Jul 27, 2024

Patch for sandbox available that passes the provided example, see javadelight/delight-nashorn-sandbox#151 (comment)

Note that the issue may still appear with some complex scripts. If you run into any further issues, please reach out!

@ViacheslavKlimov ViacheslavKlimov added confirmed Confirmed bug and removed unconfirmed Unconfirmed bug labels Jul 30, 2024
@mxro

mxro commented Aug 10, 2024

Note that, thanks to the great work of @busterace, a version is available with a better fix that also generally improves performance: it uses an AST library, which removes the need to run the regular expressions:

<dependency>
    <groupId>org.javadelight</groupId>
    <artifactId>delight-nashorn-sandbox</artifactId>
    <version>0.5.0</version>
</dependency>

@pon0marev
Contributor

Bugfix implemented in version 3.8.0
