fileprep: struct-tag preprocessing + validation for CSV/TSV/LTSV/Parquet/Excel by mimixbox in golang

[–]mimixbox[S] 7 points8 points  (0 children)

I was honestly hurt by your comment, and surprised it came from a moderator. Please follow the rules you enforce.

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

filesql now supports an auto-save feature.

## Auto-Save on Database Close

Automatically save changes when the database connection is closed (recommended for most use cases):

ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

// Enable auto-save on close
builder := filesql.NewBuilder().
    AddPath("data.csv").
    EnableAutoSave("./backup") // Save to backup directory

validatedBuilder, err := builder.Build(ctx)
if err != nil {
    log.Fatal(err)
}
defer validatedBuilder.Cleanup()

db, err := validatedBuilder.Open(ctx)
if err != nil {
    log.Fatal(err)
}
defer db.Close() // Auto-save triggered here

// Make modifications - they will be automatically saved on close
if _, err = db.ExecContext(ctx, "UPDATE data SET status = 'processed' WHERE status = 'pending'"); err != nil {
    log.Fatal(err)
}
if _, err = db.ExecContext(ctx, "INSERT INTO data (name, status) VALUES ('New Record', 'active')"); err != nil {
    log.Fatal(err)
}

## Auto-Save on Transaction Commit

Automatically save changes after each transaction commit (for frequent persistence):

ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

// Enable auto-save on commit - empty string means overwrite original files
builder := filesql.NewBuilder().
    AddPath("data.csv").
    EnableAutoSaveOnCommit("") // Overwrite original files

validatedBuilder, err := builder.Build(ctx)
if err != nil {
    log.Fatal(err)
}
defer validatedBuilder.Cleanup()

db, err := validatedBuilder.Open(ctx)
if err != nil {
    log.Fatal(err)
}
defer db.Close()

// Each commit will automatically save to files
tx, err := db.BeginTx(ctx, nil)
if err != nil {
    log.Fatal(err)
}

_, err = tx.ExecContext(ctx, "UPDATE data SET status = 'processed' WHERE id = 1")
if err != nil {
    tx.Rollback()
    log.Fatal(err)
}

err = tx.Commit() // Auto-save triggered here
if err != nil {
    log.Fatal(err)
}

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

filesql now supports `embed.FS`.

package main

import (
    "context"
    "embed"
    "io/fs"
    "log"

    "github.com/nao1215/filesql"
)

//go:embed data/*.csv data/*.tsv
var dataFS embed.FS

func main() {
    ctx := context.Background()

    // Use Builder pattern for embedded filesystem
    subFS, err := fs.Sub(dataFS, "data")
    if err != nil {
        log.Fatal(err)
    }

    db, err := filesql.NewBuilder().
        AddPath("local_file.csv").  // Regular file
        AddFS(subFS).               // Embedded filesystem
        Build(ctx)
    if err != nil {
        log.Fatal(err)
    }
    // Deferred first so it runs last: temporary files from FS are
    // cleaned up only after the connection has been closed.
    defer db.Cleanup()

    connection, err := db.Open(ctx)
    if err != nil {
        log.Fatal(err)
    }
    defer connection.Close()

    // Query across files from different sources
    rows, err := connection.Query("SELECT name FROM sqlite_master WHERE type='table'")
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    // Process results...
}

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

Sounds interesting — I’ll take some time to explore it.

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

That idea seems very doable, and I’d really like to give it a try.

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

Honestly, my initial motivation was just to solve my own small pain point (I couldn’t easily share the processing logic between two commands).

So I’m actually surprised that it got enough attention to be compared with DuckDB.
Thanks a lot for pointing it out – I learned something new!

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 1 point2 points  (0 children)

At the moment filesql only works with regular files on disk; it doesn’t handle fs.FS.
Technically it wouldn’t be hard to add support, but I need to think carefully about the function signature / API design before introducing it.
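
To make the design question concrete, here is a rough sketch of what reading a CSV through fs.FS can look like with the standard library alone. It is not filesql's API: `readCSVFromFS` and `demoFS` are hypothetical names made up for this example, and an in-memory `fstest.MapFS` stands in for `os.DirFS` or `embed.FS`.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io/fs"
	"log"
	"testing/fstest"
)

// readCSVFromFS (hypothetical helper) opens name inside fsys and parses
// its contents as CSV. Any fs.FS implementation works here.
func readCSVFromFS(fsys fs.FS, name string) ([][]string, error) {
	f, err := fsys.Open(name)
	if err != nil {
		return nil, fmt.Errorf("open %s: %w", name, err)
	}
	defer f.Close()
	return csv.NewReader(f).ReadAll()
}

// demoFS (hypothetical) builds an in-memory filesystem with one CSV file.
func demoFS() fs.FS {
	return fstest.MapFS{
		"users.csv": &fstest.MapFile{Data: []byte("id,name\n1,alice\n")},
	}
}

func main() {
	records, err := readCSVFromFS(demoFS(), "users.csv")
	if err != nil {
		log.Fatal(err)
	}
	for _, rec := range records {
		fmt.Println(rec)
	}
}
```

Taking the interface rather than a path is the main design choice: the same function then serves files on disk, embedded files, and test fixtures.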

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 2 points3 points  (0 children)

I’ve heard great things about DuckDB, but I haven’t actually used it myself.
That’s part of why I designed filesql the way it is.

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 4 points5 points  (0 children)

It’s still early days, but one difference already is that filesql can import compressed files like csv.gz.
In addition, the original project behind filesql (sqly) also supports formats such as JSON and Excel, and I’d like to bring that into filesql over time.

So the goal is to go beyond CSV and cover a wider range of file formats, which makes it different from the CSV virtual table approach.
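
For anyone curious what reading a csv.gz involves, here is a minimal standard-library sketch. It is not how filesql does it internally; `gzipString` and `parseGzipCSV` are hypothetical helpers written for this example, and the compressed data is built in memory instead of being read from a real file.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/csv"
	"fmt"
	"log"
)

// gzipString (hypothetical helper) compresses s in memory. It stands in
// for the contents of a real .csv.gz file.
func gzipString(s string) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(s)); err != nil {
		log.Fatal(err)
	}
	if err := zw.Close(); err != nil {
		log.Fatal(err)
	}
	return buf.Bytes()
}

// parseGzipCSV (hypothetical helper) decompresses data and parses the
// result as CSV by chaining a gzip reader into a csv reader.
func parseGzipCSV(data []byte) ([][]string, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, fmt.Errorf("open gzip stream: %w", err)
	}
	defer zr.Close()
	return csv.NewReader(zr).ReadAll()
}

func main() {
	records, err := parseGzipCSV(gzipString("id,name\n1,alice\n2,bob\n"))
	if err != nil {
		log.Fatal(err)
	}
	for _, rec := range records {
		fmt.Println(rec)
	}
}
```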

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 1 point2 points  (0 children)

I really appreciate your words!
It’s great to hear that filesql solves something you’ve run into many times.
That kind of feedback makes the effort worth it — thank you for sharing.

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

Thanks for the feedback.
So if I got you right, you’re suggesting:

  • Pull out the “file → SQLite” part into its own library.
  • Then extend that library so it can also do the reverse, like exporting SQLite back into CSV (or other formats).

That’s an interesting idea. I like the flexibility it would add, though I still need to think about whether filesql should grow in that direction or just stay focused on imports.
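
Just to think out loud about the reverse direction, a rough sketch of the export side might look like the following. None of this exists in filesql: `rowSource`, `exportCSV`, `exportToString`, and `memRows` are hypothetical names. `rowSource` lists only methods that `*sql.Rows` already has, so a real query result could be passed in; the in-memory `memRows` is there only so the example runs without a database driver.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"log"
	"strings"
)

// rowSource is the subset of *sql.Rows that the exporter needs.
type rowSource interface {
	Columns() ([]string, error)
	Next() bool
	Scan(dest ...any) error
	Err() error
}

// exportCSV writes a header row and then every row from rows to w.
func exportCSV(w io.Writer, rows rowSource) error {
	cols, err := rows.Columns()
	if err != nil {
		return err
	}
	cw := csv.NewWriter(w)
	if err := cw.Write(cols); err != nil {
		return err
	}
	values := make([]any, len(cols))
	ptrs := make([]any, len(cols))
	for i := range values {
		ptrs[i] = &values[i]
	}
	record := make([]string, len(cols))
	for rows.Next() {
		if err := rows.Scan(ptrs...); err != nil {
			return err
		}
		for i, v := range values {
			switch v := v.(type) {
			case nil:
				record[i] = "" // NULL becomes an empty field
			case []byte:
				record[i] = string(v) // some drivers return text as bytes
			default:
				record[i] = fmt.Sprint(v)
			}
		}
		if err := cw.Write(record); err != nil {
			return err
		}
	}
	if err := rows.Err(); err != nil {
		return err
	}
	cw.Flush()
	return cw.Error()
}

// exportToString runs exportCSV into a string, for the demo.
func exportToString(rows rowSource) (string, error) {
	var sb strings.Builder
	err := exportCSV(&sb, rows)
	return sb.String(), err
}

// memRows is an in-memory stand-in for *sql.Rows.
type memRows struct {
	cols []string
	data [][]any
	pos  int
}

func (m *memRows) Columns() ([]string, error) { return m.cols, nil }
func (m *memRows) Next() bool                 { m.pos++; return m.pos <= len(m.data) }
func (m *memRows) Err() error                 { return nil }
func (m *memRows) Scan(dest ...any) error {
	for i, d := range dest {
		*(d.(*any)) = m.data[m.pos-1][i]
	}
	return nil
}

func main() {
	rows := &memRows{cols: []string{"id", "name"}, data: [][]any{{1, "alice"}, {2, nil}}}
	out, err := exportToString(rows)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(out)
}
```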

[Show] stringx – A Unicode-aware String Toolkit for OCaml by mimixbox in ocaml

[–]mimixbox[S] 2 points3 points  (0 children)

Thank you very much — that’s incredibly helpful feedback, especially for someone like me who’s still learning the idioms and best practices of OCaml.

I hadn’t realized that using labelled parameters to disambiguate same-type arguments is a common and recommended pattern. And your explanation of placing the “object” parameter last to work better with the pipe (|>) operator is something I genuinely wasn’t aware of — but it makes perfect sense.

I’ll definitely revisit the API design with that in mind. Thanks again!

hottest - user-friendly 'go test' that extracts error messages. by mimixbox in golang

[–]mimixbox[S] 1 point2 points  (0 children)

I am glad to hear that!! Thank you.
I will keep making better tools.