Blueprints
Extract data from a REST API, process it in Python with Polars in a Docker container, then run DuckDB query and preview results as a table in the Outputs tab
About this blueprint
Python SQL Outputs API DuckDB
This flow will download a file from a REST API, process it with Python and SQL, and store the result in the internal storage.
- the
api
http Request task uses a public API — to interact with a private API endpoint, check the task documentation for examples on how to authenticate your request - the
python
task runs in a separate Docker container and installspolars
before starting the script - the DuckDB query task uses data from a previous task and outputs the query result to internal storage.
The Outputs tab provides a well-formatted table with the query results.
yaml
id: api_python_sql
namespace: company.team
tasks:
- id: api
type: io.kestra.plugin.core.http.Request
uri: https://dummyjson.com/products
- id: python
type: io.kestra.plugin.scripts.python.Script
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
containerImage: python:slim
beforeCommands:
- pip install polars
outputFiles:
- products.csv
warningOnStdErr: false
script: |
import polars as pl
data = {{ outputs.api.body | jq('.products') | first }}
df = pl.from_dicts(data)
df.glimpse()
df.select(["brand", "price"]).write_csv("products.csv")
- id: sql_query
type: io.kestra.plugin.jdbc.duckdb.Query
inputFiles:
in.csv: "{{ outputs.python.outputFiles['products.csv'] }}"
sql: |
SELECT brand, round(avg(price), 2) as avg_price
FROM read_csv_auto('{{ workingDir }}/in.csv', header=True)
GROUP BY brand
ORDER BY avg_price DESC;
store: true