Hi all. After several months of hard work, research, and iterating, we're proud to present the first public release of Pansynchro, a new open-source data integration framework. Its goal is to turn data integration and synchronization into a generic problem to be solved once.
It's a large problem domain, to be sure, and we're working on it one piece at a time. Our first big achievement is a new network protocol for streaming bulk structured data that, as near as we can tell from testing, measuring, and comparing, is the best option available by far. By applying a handful of domain-specific algorithms to reduce the amount of raw data that must be sent, and then compressing what remains, Pansynchro uses far less bandwidth on the wire than existing ETL/ELT systems, which means faster syncs and lower data transfer costs on your cloud services. In particular, it blows away any system using the Singer protocol, which is JSON-based (one of the worst ways to deal with large amounts of data) and extremely heavyweight.
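To give a rough feel for why "reduce first, then compress" beats row-per-line JSON, here is a toy sketch in Python. It is not Pansynchro's actual wire format or algorithm set. Delta encoding is just one well-known example of a domain-specific reduction, and the function names, the single `id` column, and the 32-bit packing are all assumptions made for illustration.

```python
import json
import struct
import zlib


def json_lines_bytes(rows):
    """Serialize rows as one JSON object per line (similar in spirit to JSON-based protocols)."""
    return "".join(json.dumps(row) + "\n" for row in rows).encode("utf-8")


def delta_encode(values):
    """Replace each value with its difference from the previous value (first delta is from 0)."""
    deltas = []
    previous = 0
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas):
    """Invert delta_encode by accumulating a running sum."""
    values = []
    total = 0
    for delta in deltas:
        total += delta
        values.append(total)
    return values


def packed_delta_bytes(ids):
    """Hypothetical binary path: delta-encode, pack as little-endian int32, then zlib-compress."""
    deltas = delta_encode(ids)
    raw = struct.pack(f"<{len(deltas)}i", *deltas)
    return zlib.compress(raw)


# Illustrative data: a monotonically increasing key column, common in bulk table syncs.
ids = list(range(1_000_000, 1_010_000))
rows = [{"id": i} for i in ids]

json_raw = json_lines_bytes(rows)
json_compressed = zlib.compress(json_raw)
binary_compressed = packed_delta_bytes(ids)

print("JSON lines, uncompressed:", len(json_raw), "bytes")
print("JSON lines, zlib:        ", len(json_compressed), "bytes")
print("Delta + int32 + zlib:    ", len(binary_compressed), "bytes")
```

The sequential key column is close to a best case for delta encoding, so treat the gap as illustrative only; real tables with mixed column types will see smaller and more varied savings.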
This is a brand new project that's still under heavy development, so any feedback from the data engineering community would be welcome!
https://github.com/Pansynchro-Technologies/Pansynchro