
Schedule a Python data ingestion job to extract data from an API and load it into DuckDB using dlt (data load tool)

About this blueprint

Tags: Ingest, Data, Schedule

This flow uses dlt (data load tool) to extract player data from the Chess.com API and load it into a DuckDB destination. A Schedule trigger is included to run the flow daily at 9 AM; it ships with `disabled: true`, so remove that line or set it to `false` to activate the schedule.

```yaml
id: dlt
namespace: company.team
tasks:
  - id: chess_api_to_duckdb
    type: io.kestra.plugin.scripts.python.Script
    taskRunner:
      type: io.kestra.plugin.scripts.runner.docker.Docker
    containerImage: python:slim
    beforeCommands:
      - pip install dlt[duckdb]
    warningOnStdErr: false
    script: |
      import dlt
      import requests

      pipeline = dlt.pipeline(
          pipeline_name='chess_pipeline',
          destination='duckdb',
          dataset_name='player_data'
      )
      data = []
      for player in ['magnuscarlsen', 'rpragchess']:
          response = requests.get(f'https://api.chess.com/pub/player/{player}')
          response.raise_for_status()
          data.append(response.json())
      # Extract, normalize, and load the data
      pipeline.run(data, table_name='player')
triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    disabled: true
    cron: "0 9 * * *"
```
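Once the script has run, dlt writes the data to a local DuckDB file in the task's working directory. As a minimal sketch of how you could inspect the result, assuming dlt's default file naming of `<pipeline_name>.duckdb` and the dataset and table names used in the flow above, you could query it with the DuckDB Python client:

```python
import duckdb

# Sketch only: assumes dlt created `chess_pipeline.duckdb` (its default
# <pipeline_name>.duckdb file) in the current working directory, and that the
# records landed in the `player_data` dataset as a `player` table.
con = duckdb.connect("chess_pipeline.duckdb")

# List all tables dlt created, including its internal _dlt_* bookkeeping tables
print(con.sql("SHOW ALL TABLES"))

# Preview the loaded player records
print(con.sql("SELECT * FROM player_data.player"))

con.close()
```

Note that in this flow the DuckDB file lives in the task's ephemeral working directory inside the Docker container, so you would either append a check like this to the same script or persist the `.duckdb` file (for example via the script task's output files) before querying it elsewhere.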

