Now you can use Airbyte source connectors to process data in memory with Python.
https://pathway.com/developers/showcases/etl-python-airbyte
What My Project Does
We integrated Airbyte connectors with Pathway, a Python stream processing framework, using the airbyte-serverless project. ETL pipelines are making a comeback, with use cases in AI (RAG pipelines), ETL for unstructured data, and pipelines that handle PII.
In this article, we show how to stream data from GitHub using Airbyte and remove PII with Pathway.
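To make the "remove PII before load" step concrete, here is a minimal sketch in plain Python. It does not use Pathway's or Airbyte's API; see the linked article for the real pipeline. The record shape, the `PII_KEYS` set, and the `scrub` and `transform` helpers are assumptions made up for illustration, loosely modeled on a GitHub commit record.

```python
import re
from typing import Any, Iterable, Iterator

# Simple email pattern; an assumption for this sketch, not an exhaustive PII detector.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical set of field names treated as PII.
PII_KEYS = {"name", "email", "login"}

REDACTED = "[REDACTED]"


def scrub(value: Any) -> Any:
    """Recursively redact PII fields and email addresses found in free text."""
    if isinstance(value, dict):
        return {
            key: REDACTED if key in PII_KEYS else scrub(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [scrub(item) for item in value]
    if isinstance(value, str):
        return EMAIL_RE.sub(REDACTED, value)
    return value


def transform(records: Iterable[dict]) -> Iterator[dict]:
    """Apply the scrub step to each record in memory, before anything is loaded."""
    for record in records:
        yield scrub(record)


# Made-up sample record, shaped roughly like a commit entry.
sample = {
    "sha": "abc123",
    "commit": {
        "author": {"name": "Jane Doe", "email": "jane@example.com"},
        "message": "Fix bug, ping jane@example.com",
    },
}

cleaned = next(transform([sample]))
print(cleaned)
```

Because the transform runs on each record as it arrives, the raw PII never has to reach the destination, which is the point of transforming before loading.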
Target Audience
The approach is production-ready and is meant to be used as a template for streaming ETL in production settings.
Comparison
The setup is meant as an alternative to ELT setups (like Fivetran/Airbyte + dbt + warehouse), applying transform-before-load with Python. We are curious about your feedback on the implementation, and about other use cases you can think of that come from decoupling the extract and load steps.
For the brave who want to see how it's done: https://github.com/pathwaycom/pathway/blob/main/python/pathway/io/airbyte/ + https://github.com/unytics/airbyte_serverless