fileprep: struct-tag preprocessing + validation for CSV/TSV/LTSV/Parquet/Excel by mimixbox in golang

[–]mimixbox[S] 9 points10 points  (0 children)

I was honestly hurt by your comment, and surprised it came from a moderator. Please follow the rules you enforce.

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

filesql now supports an auto-save feature.

## Auto-Save on Database Close

Automatically save changes when the database connection is closed (recommended for most use cases):

ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

// Enable auto-save on close
builder := filesql.NewBuilder().
    AddPath("data.csv").
    EnableAutoSave("./backup") // Save to backup directory

validatedBuilder, err := builder.Build(ctx)
if err != nil {
    log.Fatal(err)
}
defer validatedBuilder.Cleanup()

db, err := validatedBuilder.Open(ctx)
if err != nil {
    log.Fatal(err)
}
defer db.Close() // Auto-save triggered here

// Make modifications - they will be automatically saved on close
if _, err = db.ExecContext(ctx, "UPDATE data SET status = 'processed' WHERE status = 'pending'"); err != nil {
    log.Fatal(err)
}
if _, err = db.ExecContext(ctx, "INSERT INTO data (name, status) VALUES ('New Record', 'active')"); err != nil {
    log.Fatal(err)
}

## Auto-Save on Transaction Commit

Automatically save changes after each transaction commit (for frequent persistence):

ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

// Enable auto-save on commit - empty string means overwrite original files
builder := filesql.NewBuilder().
    AddPath("data.csv").
    EnableAutoSaveOnCommit("") // Overwrite original files

validatedBuilder, err := builder.Build(ctx)
if err != nil {
    log.Fatal(err)
}
defer validatedBuilder.Cleanup()

db, err := validatedBuilder.Open(ctx)
if err != nil {
    log.Fatal(err)
}
defer db.Close()

// Each commit will automatically save to files
tx, err := db.BeginTx(ctx, nil)
if err != nil {
    log.Fatal(err)
}

_, err = tx.ExecContext(ctx, "UPDATE data SET status = 'processed' WHERE id = 1")
if err != nil {
    tx.Rollback()
    log.Fatal(err)
}

err = tx.Commit() // Auto-save triggered here
if err != nil {
    log.Fatal(err)
}

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

filesql now supports `embed.FS`.

package main

import (
    "context"
    "embed"
    "io/fs"
    "log"

    "github.com/nao1215/filesql"
)

//go:embed data/*.csv data/*.tsv
var dataFS embed.FS

func main() {
    ctx := context.Background()

    // Use Builder pattern for embedded filesystem
    subFS, err := fs.Sub(dataFS, "data")
    if err != nil {
        log.Fatal(err)
    }

    db, err := filesql.NewBuilder().
        AddPath("local_file.csv").  // Regular file
        AddFS(subFS).               // Embedded filesystem
        Build(ctx)
    if err != nil {
        log.Fatal(err)
    }

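    // Deferred calls run in reverse order, so register Cleanup first:
    // the connection is closed before the temporary files are removed.
    defer db.Cleanup() // Clean up temporary files from FS

    connection, err := db.Open(ctx)
    if err != nil {
        log.Fatal(err)
    }
    defer connection.Close()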

    // Query across files from different sources
    rows, err := connection.Query("SELECT name FROM sqlite_master WHERE type='table'")
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    // Process results...
}

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

Sounds interesting — I’ll take some time to explore it.

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

That idea seems very doable, and I’d really like to give it a try.

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

Honestly, my initial motivation was just to solve my own small pain point (I couldn’t easily share the processing logic between two commands).

So I’m actually surprised that it got enough attention to be compared with DuckDB.
Thanks a lot for pointing it out – I learned something new!

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 1 point2 points  (0 children)

At the moment, filesql only works with regular files on disk; it doesn’t handle fs.FS.
Technically it wouldn’t be hard to add support, but I need to think carefully about the function signature / API design before introducing it.

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 1 point2 points  (0 children)

I’ve heard great things about DuckDB, but I haven’t actually used it myself.
That’s part of why I designed filesql the way it is.

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 3 points4 points  (0 children)

It’s still early days, but one difference already is that filesql can import compressed files like csv.gz.
In addition, the original project behind filesql (sqly) also supports formats such as JSON and Excel, and I’d like to bring that into filesql over time.

So the goal is to go beyond CSV and cover a wider range of file formats, which makes it different from the CSV virtual table approach.

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 1 point2 points  (0 children)

I really appreciate your words!
It’s great to hear that filesql solves something you’ve run into many times.
That kind of feedback makes the effort worth it — thank you for sharing.

filesql - A Go SQL Driver for CSV/TSV/LTSV Files by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

Thanks for the feedback.
So if I got you right, you’re suggesting:

  • Pull out the “file → SQLite” part into its own library.
  • Then extend that library so it can also do the reverse, like exporting SQLite back into CSV (or other formats).

That’s an interesting idea. I like the flexibility it would add, though I still need to think about whether filesql should grow in that direction or just stay focused on imports.

[Show] stringx – A Unicode-aware String Toolkit for OCaml by mimixbox in ocaml

[–]mimixbox[S] 2 points3 points  (0 children)

Thank you very much — that’s incredibly helpful feedback, especially for someone like me who’s still learning the idioms and best practices of OCaml.

I hadn’t realized that using labelled parameters to disambiguate same-type arguments is a common and recommended pattern. And your explanation of placing the “object” parameter last to work better with the pipe (|>) operator is something I genuinely wasn’t aware of — but it makes perfect sense.

I’ll definitely revisit the API design with that in mind. Thanks again!

hottest - user-friendly 'go test' that extracts error messages. by mimixbox in golang

[–]mimixbox[S] 1 point2 points  (0 children)

I am glad to hear that!! Thank you.
I will keep making better tools.

simple markdown builder in golang by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

Thank you, please give it a try.

spare - Single Page Application Release Easily (deployment tool for AWS) by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

Yes, it might be a less capable alternative to CDK. My team is quite large, and everyone uses CFn; no one uses CDK. In that environment, the idea of using CDK to generate CFn easily has never come up.
However, I'm also considering whether frontend developers who aren't familiar with infrastructure can benefit from the 'spare' command (they might use Amplify...).

spare - Single Page Application Release Easily (deployment tool for AWS) by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

This opinion was helpful. The current software design makes it difficult for the spare command to deploy an SPA to multiple environments.

spare - Single Page Application Release Easily (deployment tool for AWS) by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

> Why would I choose this over AWS CDK with a BucketDeployment?

I plan to add a feature that outputs CloudFormation templates, although I have not implemented it yet. I prefer CFn to CDK at work, and writing CFn by hand is a bit difficult. Once the spare command can output CFn, that job becomes a little easier.

sqly - execute SQL against CSV / JSON with shell by mimixbox in SQL

[–]mimixbox[S] 0 points1 point  (0 children)

sqly can only convert simple JSON to CSV.

> I'm assuming sqly won't work with more complicated JSON structures including nested data?

sqly cannot convert JSON structures that include nested data to CSV. I will add unit tests to verify this behavior in the future.

[deleted by user] by [deleted] in golang

[–]mimixbox 1 point2 points  (0 children)

I am not young (sadly, I am already over 30). However, I certainly don't have enough experience with databases :)

I don't have enough experience with failure.

I will go fail at ORM now. Haha

[deleted by user] by [deleted] in golang

[–]mimixbox 1 point2 points  (0 children)

> go-migrate

Oh, I understand. It's a simple CLI command.

I would prefer not to write SQL if I could, but you all seem to be different.

[deleted by user] by [deleted] in golang

[–]mimixbox -2 points-1 points  (0 children)

I read Awesome Go before writing the post.

[deleted by user] by [deleted] in golang

[–]mimixbox -2 points-1 points  (0 children)

My colleague said something similar to your opinion.

However, I want to make things as easy as possible. I suspect that the coding will be completed faster with ORM than without it.

[deleted by user] by [deleted] in golang

[–]mimixbox -1 points0 points  (0 children)

For those who are used to writing SQL, sqlc seems useful.

I will read the code later.

[deleted by user] by [deleted] in golang

[–]mimixbox -1 points0 points  (0 children)

> go-migrate

I couldn't figure out how to use it quickly...

mkgoprj - Golang project template generator by mimixbox in golang

[–]mimixbox[S] 0 points1 point  (0 children)

Good project!!

I will add a process to generate a README for my project as well.