User Story
In order to achieve true horizontal scaling, datagovteam wants to be able to launch multiple instances of the airflow-worker application and see them pick up queued work.
Acceptance Criteria
[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]
GIVEN I run cf push --vars-file my_vars_file
AND I have configured the datagov-harvester manifest to launch multiple instances of the worker application
WHEN I look at the "Cluster Activity" tab in the Airflow UI
THEN I will see that Airflow is queuing new work and that the queued work is being picked up and run by the worker instances
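For context, launching multiple worker instances is a manifest setting. A rough sketch of the relevant excerpt (the app name and memory value here are assumptions, not the actual datagov-harvester manifest):

```yaml
# hypothetical manifest excerpt -- names and sizes are assumptions
applications:
- name: airflow-worker
  instances: 2   # more than one instance is what currently breaks work pickup
  memory: 2G
```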
Background
Currently, launching more than one instance of the airflow-worker application causes the worker instances not to pick up work, whereas a single instance has no issues.
Considering the Celery documentation, this may be mitigated by launching the worker instances with a hostname:
You can start multiple workers on the same machine, but be sure to name each individual worker by specifying a node name with the --hostname argument:
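The Celery workers guide illustrates this with a distinct node name per worker, along these lines (proj is a placeholder app name from the docs, not ours):

```shell
celery -A proj worker --loglevel=INFO -n worker1@%h
celery -A proj worker --loglevel=INFO -n worker2@%h
```

Here %h expands to the machine's hostname, so each worker gets a unique node name even on the same host.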
Sketch
Determine if we can launch the worker instances using the .profile and supply them with a unique start command, similar to airflow celery worker -n worker-{CF_INSTANCE_INDEX}, using CF env vars
.profile runs before the manifest, and can be used to prep custom ENV VARS for use in the manifest
CF env vars are always available in the manifest with no manipulation in the profile
thus, it is possible to run a command like airflow celery worker -n worker-{CF_INSTANCE_INDEX} with ease
however, this command does not map one-to-one with the Celery command for adding a hostname, and there is an issue with multiple workers binding to the same port for serving logs
this may be mitigated when/if we decide to enable remote logging to S3
and, based on this comment from an Airflow core maintainer, it may be preferable to let Celery workers auto-scale on a single machine rather than scaling the number of instances
so, our best path may be to distribute DAG execution over the entire day to keep load steady, and to tune the instance's memory as best we can using external monitoring.
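A minimal sketch of the .profile idea, assuming CF_INSTANCE_INDEX is provided by the platform (Cloud Foundry sets it automatically in each app container) and that the start command reads the derived variable; the variable name AIRFLOW_WORKER_NAME is our own invention for illustration:

```shell
# hypothetical .profile: derive a unique worker name from the CF instance index
# CF_INSTANCE_INDEX is set by Cloud Foundry; default to 0 for local runs
export AIRFLOW_WORKER_NAME="worker-${CF_INSTANCE_INDEX:-0}"
echo "starting ${AIRFLOW_WORKER_NAME}"
```

The manifest's start command could then reference it, e.g. airflow celery worker -n "$AIRFLOW_WORKER_NAME".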
Security Considerations (required)
[Any security concerns that might be implicated in the change. "None" is OK, just be explicit here!]