Google Cloud
Integrate Google Cloud Platform services with Kestra data workflows.
Authentication
All tasks must authenticate to Google Cloud Platform. There are several ways to do this:
- Set the task's serviceAccount property, which must contain the service account JSON content. It can be handy to set this property globally by using plugin defaults if your cluster accesses only one GCP project.
- Set the GOOGLE_APPLICATION_CREDENTIALS environment variable on the nodes running Kestra. It must point to an application credentials file. Warning: it must be the same on all worker nodes, and it can raise security concerns.
- If neither is set, the default service account is used.
You can also set authentication scopes. By default, only one scope is used: https://www.googleapis.com/auth/cloud-platform.
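As a rough illustration of the first option, the flow below passes the service account JSON through the task's serviceAccount property and spells out the default scope. Treat it as a sketch under assumptions rather than a reference configuration: the flow id, namespace, secret name GCP_SERVICE_ACCOUNT_JSON, and SQL statement are placeholders, and the task type with its scopes and sql properties should be checked against the plugin reference for your Kestra version.

```yaml
id: gcp_auth_example              # hypothetical flow id
namespace: company.team           # hypothetical namespace

tasks:
  - id: query
    type: io.kestra.plugin.gcp.bigquery.Query
    # Service account JSON content, read from a secret (hypothetical secret name)
    serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"
    # Optional: this repeats the default scope named above
    scopes:
      - https://www.googleapis.com/auth/cloud-platform
    sql: SELECT 1                 # placeholder statement
```

Reading the JSON from a secret keeps the key material out of the flow source. If serviceAccount is omitted, authentication falls back to GOOGLE_APPLICATION_CREDENTIALS or to the default service account, as described above.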
Common property
Each task allows configuring the GCP project identifier in the projectId property. If not set, the default project identifier is used (the one returned by ServiceOptions.getDefaultProjectId()). It can be handy to set this property globally by using plugin defaults if your cluster accesses only one GCP project.
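A minimal sketch of setting these properties once with plugin defaults might look like the following. The project id my-gcp-project, the secret name, the flow id, and the namespace are hypothetical, and the example assumes that a type prefix such as io.kestra.plugin.gcp applies the values to every task in the plugin. Older Kestra versions use the taskDefaults key instead, so verify the key name for your version.

```yaml
id: gcp_defaults_example          # hypothetical flow id
namespace: company.team           # hypothetical namespace

pluginDefaults:
  # Assumed to match all tasks whose type starts with this prefix
  - type: io.kestra.plugin.gcp
    values:
      projectId: my-gcp-project   # hypothetical project identifier
      serviceAccount: "{{ secret('GCP_SERVICE_ACCOUNT_JSON') }}"

tasks:
  - id: query
    type: io.kestra.plugin.gcp.bigquery.Query
    sql: SELECT 1                 # inherits projectId and serviceAccount from the defaults
```

A property set directly on a task should take precedence over the default, which leaves room for the occasional task that targets a different project.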
BigQuery
This sub-group of plugins contains tasks for accessing Google Cloud BigQuery. BigQuery is a completely serverless and cost-effective enterprise data warehouse.
Triggers
Tasks
- Copy
- CopyPartitions
- CreateDataset
- CreateTable
- DeleteDataset
- DeletePartitions
- DeleteTable
- ExtractToGcs
- Load
- LoadFromGcs
- Query
- StorageWrite
- TableMetadata
- UpdateDataset
- UpdateTable
Pub/Sub
This sub-group of plugins contains tasks for accessing Google Cloud Pub/Sub. Pub/Sub is an asynchronous and scalable messaging service that decouples services producing messages from services processing those messages.
Triggers
Tasks
Cloud Storage (GCS)
This sub-group of plugins contains tasks for accessing Google Cloud Storage (GCS). Cloud Storage is a managed service for storing unstructured data.
Triggers
Tasks
- Compose
- Copy
- CreateBucket
- CreateBucketIamPolicy
- Delete
- DeleteBucket
- DeleteList
- Download
- Downloads
- List
- UpdateBucket
- Upload
Dataproc Batches
This sub-group of plugins contains tasks for submitting batches on Google Cloud Dataproc. Dataproc is a managed Apache Spark and Apache Hadoop service that lets you take advantage of open source data tools for batch processing, querying, streaming, and machine learning.
Tasks
Dataproc Clusters
This sub-group of plugins contains tasks to manipulate clusters on Google Cloud Dataproc. Dataproc is a managed Apache Spark and Apache Hadoop service that lets you take advantage of open source data tools for batch processing, querying, streaming, and machine learning.
Tasks
Cli
Tasks
Google Cloud Function
This sub-group of plugins contains tasks for triggering Google Cloud Functions. Cloud Run functions automatically manage and scale the underlying infrastructure with the size of the workload. Deploy your code and let Google run and scale it for you.
Tasks
Firestore
This sub-group of plugins contains tasks for accessing Google Cloud Firestore. Firestore is a flexible, scalable NoSQL cloud database.
Tasks
Vertex AI
This sub-group of plugins contains tasks for accessing Google Cloud Vertex AI. Vertex AI lets you build, deploy, and scale machine learning (ML) models faster, with fully managed ML tools for any use case.
Tasks
Kubernetes Engine (GKE)
This sub-group of plugins contains tasks for accessing Google Kubernetes Engine (GKE). Kubernetes Engine is a scalable and fully automated Kubernetes service.
Tasks
Authentication
This sub-group of plugins contains tasks to manage authentication for Google Cloud.
Tasks