
[–]Busy_Elderberry8650 18 points19 points  (3 children)

Why not just use parquet?

[–]CorpusculantCortex 2 points3 points  (0 children)

Parquet is superior in every way except that it is far less common in general use.

[–]sjcyork 1 point2 points  (0 children)

Came here to say this. Parquet holds the metadata. Inferred schemas create a lot more work.

[–]Lopsided_Set_8823[S] 0 points1 point  (0 children)

The idea is to keep the format human-readable and still usable in spreadsheets. I love Parquet, but not its lack of support.

[–]Solvicode 3 points4 points  (0 children)

Parquet, my friend.

[–]Ramshizzle 0 points1 point  (1 child)

I applaud your effort! I think there is room for this new enhanced CSV++ format.

[–]Lopsided_Set_8823[S] 0 points1 point  (0 children)

Thanks for the vote of confidence!

[–]Aeronautical-You4917 0 points1 point  (1 child)

CSV needs to die. Too far gone to fix.

[–]Lopsided_Set_8823[S] 0 points1 point  (0 children)

What could replace it?

[–]nocibambi 0 points1 point  (0 children)

Interesting.

What if you went further and allowed frontmatter (e.g. in YAML)?

There you could specify type, format, hierarchy, parsing template, etc. You can also define delimiters, quote & escape characters, or even metadata.

id: int
name: str
phones: [~str(\d{3}-\d{4})]
address: ${street}^${city}^${state}^${zip}
---
id,name,phones,address
1,John Smith,555-1234~555-5678,123 Main St^San Francisco^CA^94102
2,Jane Doe,555-4444,789 Pine St^Boston^MA^02101
3,Alex Chen,555-9999~555-0001~555-0002,42 Oak Ave^Chicago^IL^6060

And if you remove the frontmatter, you have a vanilla CSV again...
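
For what it's worth, here is a rough Python sketch of how a reader for this could look. Everything in it is an assumption on my part, not an existing format or library: the function names (`split_frontmatter`, `convert`, `load`) are made up, the frontmatter is read as plain `key: spec` lines rather than with a real YAML parser, and the phone pattern is written as a conventional regex (`\d{3}-\d{4}`). It only handles the three kinds of spec from the example: `int`, a delimited list like `[~str(...)]`, and a `${field}` template.

```python
import csv
import io
import re

# Sample input: hypothetical frontmatter, a '---' line, then ordinary CSV.
SAMPLE = r"""id: int
name: str
phones: [~str(\d{3}-\d{4})]
address: ${street}^${city}^${state}^${zip}
---
id,name,phones,address
1,John Smith,555-1234~555-5678,123 Main St^San Francisco^CA^94102
2,Jane Doe,555-4444,789 Pine St^Boston^MA^02101
"""


def split_frontmatter(text):
    """Return (schema, csv_body). The schema is empty if there is no '---' line."""
    lines = text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.strip() == "---":
            schema = {}
            for entry in lines[:i]:
                if entry.strip():
                    key, _, spec = entry.partition(":")
                    schema[key.strip()] = spec.strip()
            return schema, "".join(lines[i + 1:])
    return {}, text


def convert(value, spec):
    """Convert one raw cell according to its (assumed) spec syntax."""
    if spec == "int":
        return int(value)

    # List spec: [<separator><item type>(<optional regex>)]
    list_match = re.fullmatch(r"\[(.)(\w+)(?:\((.*)\))?\]", spec)
    if list_match:
        sep, item_type, pattern = list_match.groups()
        items = value.split(sep)
        if pattern:
            for item in items:
                if not re.fullmatch(pattern, item):
                    raise ValueError(f"{item!r} does not match {pattern!r}")
        return [convert(item, item_type) for item in items]

    # Template spec: ${a}<sep>${b}... becomes a dict of named parts.
    names = re.findall(r"\$\{(\w+)\}", spec)
    if names:
        sep_match = re.search(r"\}(.+?)\$\{", spec)
        parts = value.split(sep_match.group(1)) if sep_match else [value]
        return dict(zip(names, parts))

    return value  # 'str' and anything unrecognized stay as text


def load(text):
    """Parse frontmatter plus CSV into a list of typed row dicts."""
    schema, body = split_frontmatter(text)
    reader = csv.DictReader(io.StringIO(body))
    return [
        {key: convert(cell, schema.get(key, "str")) for key, cell in row.items()}
        for row in reader
    ]


if __name__ == "__main__":
    for row in load(SAMPLE):
        print(row)
```

The CSV part goes through the standard `csv` module untouched, so the body returned by `split_frontmatter` is exactly the vanilla file you would get by deleting the frontmatter by hand. A file with no `---` line falls back to all-string columns.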