In a previous post, we deployed a Kubernetes cluster in AWS and Airflow on that Kubernetes cluster (EKS) using Fargate nodes.

Workers are an essential part of Airflow. They are daemons that actually execute the logic of tasks.

Airflow's executors are the mechanism by which task instances are run. Depending on your needs, you can select from local or remote executors. As the name suggests, local executors are great for debugging purposes on your localhost; they are not meant for production usage. From the production options, we will look at the Celery and Kubernetes executors, along with the Kubernetes pod operator.

CeleryExecutor and KEDA

Celery is the default executor in Airflow's Helm chart. When you create a task, it is sent to a queue that workers listen to, from which they can pick up tasks to execute. Workers manage one to many celeryd processes that execute the desired tasks.

In the simplest deployment, we could spawn a fixed number of workers, each with a fixed number of tasks allowed to run in parallel. You can set the limit of maximal tasks running on one worker with the environment variable AIRFLOW_CELERY_WORKER_CONCURRENCY.

This would be fine as long as you don't mind unnecessarily long waiting times. The main downside shows up when only a small number of tasks are scheduled, or none at all: some workers are running without processing anything, and you are paying for nothing. If too many tasks are scheduled, they just wait in the queue until they are finally processed.

With newer Airflow and the CeleryExecutor, you can utilize KEDA to handle auto-scaling of workers according to the number of tasks in the queue.

Want to get up and running fast in the cloud? Contact us today.
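The queue-based scaling idea described above boils down to a simple formula: the number of workers needed is the number of queued or running tasks divided by the per-worker concurrency, rounded up. Here is a minimal Python sketch of that logic; the function name, the worker cap, and the concrete numbers are illustrative assumptions, not the Helm chart's actual KEDA query.

```python
import math

def desired_worker_count(pending_tasks: int,
                         worker_concurrency: int,
                         max_workers: int = 10) -> int:
    """Illustrative sketch: how many Celery workers are needed so every
    queued or running task has a slot, capped at max_workers."""
    needed = math.ceil(pending_tasks / worker_concurrency)
    return min(needed, max_workers)

# With a concurrency of 16 tasks per worker, 40 pending tasks need 3 workers.
print(desired_worker_count(40, 16))  # 3
# With an empty queue, the worker count can drop to zero, so nothing idles.
print(desired_worker_count(0, 16))   # 0
```

In the real deployment, KEDA evaluates an equivalent expression against Airflow's metadata database on a polling interval and adjusts the worker replica count accordingly, which is what lets idle workers scale away instead of costing money.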