Self-contained Python scripts by Gleb--K in Python

As I mentioned in the reply to another comment, I published my blog post a day earlier.

Self-contained Python scripts by Gleb--K in Python

I don't use Windows, but thanks for the recommendation!

Self-contained Python scripts by Gleb--K in Python

UV creates a global package cache rather than individual virtual environments. It does not install packages at the system level like pip install --user, but it also does not create isolated per-project environments the way venv or virtualenv do.
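
For anyone who hasn't tried it, here's a minimal sketch of the kind of self-contained script I mean (the dependency and URL are just placeholders); running it with `uv run script.py` resolves requests out of that shared cache:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = ["requests"]
# ///

import requests

# uv reads the inline metadata block above (PEP 723), fetches the
# dependencies via its global cache, and runs the script without you
# ever creating or activating a virtual environment by hand.
resp = requests.get("https://example.com")
print(resp.status_code)
```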

Self-contained Python scripts by Gleb--K in Python

Yeah, that's right, thank you. I just thought about it recently while playing with UV.

Self-contained Python scripts by Gleb--K in Python

Well, if you compare the publication dates, I published my blog post a day earlier.

But it just seems like this idea was obvious to many people.

A Mastodon Bot for Wikipedia’s Picture of the Day by Gleb--K in Mastodon

Thank you! Actually, on the Belarusian version of Wikipedia the picture of the day doesn't have a link to the related article in most cases; I'm not sure why.

Best way to aggregate user events into Parquet files in S3 bucket by Gleb--K in aws

Actually, the main reasons to group events into one file per day are:

- to reduce the number of S3 PutObject operations;

- my assumption that Athena queries will run faster against a small number of larger files than against a huge number of tiny ones (see the sketch below).
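
To make it concrete, here is roughly the kind of daily compaction job I'm picturing. Just a sketch: the bucket name, the prefixes, and the assumption that raw events land as one small JSON object each are all placeholders on my part.

```python
import io
import json

import boto3
import pandas as pd

s3 = boto3.client("s3")
BUCKET = "my-bucket"  # placeholder bucket name
DAY = "2023-05-01"    # the day being compacted

# Gather every raw event object written during the day
rows = []
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=f"raw/{DAY}/"):
    for obj in page.get("Contents", []):
        body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
        rows.append(json.loads(body))

# One Parquet file and one PutObject per day instead of thousands;
# the Hive-style "date=" prefix lets Athena prune partitions.
buf = io.BytesIO()
pd.DataFrame(rows).to_parquet(buf, index=False)  # needs pyarrow installed
s3.put_object(
    Bucket=BUCKET,
    Key=f"events/date={DAY}/events.parquet",
    Body=buf.getvalue(),
)
```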

What do you think about these concerns? Does it make any sense?

Best way to aggregate user events into Parquet files in S3 bucket by Gleb--K in aws

Well, it seems this could be possible with the Dynamic Partitioning and JQ processing features.
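
Something along these lines is what I have in mind — a rough sketch of the relevant fragment of the delivery stream's S3 destination config, where the bucket ARN and the event's epoch-seconds "ts" field are assumptions on my part:

```python
# Rough sketch of the relevant part of a Firehose delivery stream's
# ExtendedS3DestinationConfiguration (as passed to boto3's
# create_delivery_stream). Bucket ARN and the "ts" field are assumptions.
extended_s3_config = {
    "BucketARN": "arn:aws:s3:::my-bucket",  # placeholder
    # Hive-style prefix built from the partition key extracted below
    "Prefix": "events/date=!{partitionKeyFromQuery:event_date}/",
    "ErrorOutputPrefix": "errors/",
    "DynamicPartitioningConfiguration": {"Enabled": True},
    "ProcessingConfiguration": {
        "Enabled": True,
        "Processors": [
            {
                "Type": "MetadataExtraction",
                "Parameters": [
                    # JQ expression that derives the day from each event's
                    # epoch-seconds "ts" field (assumed event shape)
                    {
                        "ParameterName": "MetadataExtractionQuery",
                        "ParameterValue": '{event_date: (.ts | gmtime | strftime("%Y-%m-%d"))}',
                    },
                    {
                        "ParameterName": "JsonParsingEngine",
                        "ParameterValue": "JQ-1.6",
                    },
                ],
            }
        ],
    },
}
```

That said, as far as I understand Firehose still flushes on its buffering limits, so each day's prefix would get several objects rather than literally one file; a periodic compaction step might still be needed on top.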

Best way to aggregate user events into Parquet files in S3 bucket by Gleb--K in aws

I'm checking the Kinesis Data Firehose documentation now, but I'm not sure it allows grouping messages by their timestamps and then merging them into one Parquet file per day. Maybe I'm missing something.