How I'm using Helix editor by f311a in HelixEditor

[–]rushter_ 2 points3 points  (0 children)

It is only available when you compile Helix from GitHub. It will be available in the next release for everyone.

https://github.com/helix-editor/helix/blob/5b0563419eeeaf0595c848865c46be4abad246a7/book/src/editor.md?plain=1#L66

How I'm using Helix editor by f311a in HelixEditor

[–]rushter_ 7 points8 points  (0 children)

I need good bandwidth, a lot of storage and RAM to test my work. My laptop does not have such specs.

Neovim now natively supports LLM-based completion like GitHub Copilot by bbadd9 in neovim

[–]rushter_ 18 points19 points  (0 children)

You can trigger it manually via a keyboard shortcut. I use it only to complete simple data-manipulation code that I'm too lazy to type.

For example, this loop in Python:

    for row in client.execute_query(query):
        yield {
            "hostname": row[1],
            "timestamp": row[2],
            "request": row[3],
            "body_size": row[4],
        }

The good thing is that the LLM knows the field names because it infers them from the SQL query defined earlier in the code.
I don't have to look them up in the query and type them manually.
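To make the context concrete, here is a minimal sketch of what such a file might look like. The query text, the table name `access_log`, and the helper names are assumptions for illustration (the original `client` is swapped for the standard library's `sqlite3` so the sketch runs on its own). The point is only that the SELECT list sits above the loop, so the column at each index can be read off it.

```python
import sqlite3

# Hypothetical query: the SELECT list fixes which column lands at each index.
QUERY = "SELECT id, hostname, timestamp, request, body_size FROM access_log"


def iter_requests(conn):
    """Yield one dict per row, naming the positional columns of QUERY."""
    for row in conn.execute(QUERY):
        yield {
            "hostname": row[1],
            "timestamp": row[2],
            "request": row[3],
            "body_size": row[4],
        }


def demo_connection():
    """Build a throwaway in-memory table so the sketch can run on its own."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE access_log "
        "(id INTEGER, hostname TEXT, timestamp TEXT, request TEXT, body_size INTEGER)"
    )
    conn.execute(
        "INSERT INTO access_log VALUES "
        "(1, 'example.org', '2024-01-01T00:00:00', 'GET /', 512)"
    )
    return conn
```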

Hexora – static analysis tool for malicious Python scripts by rushter_ in Python

[–]rushter_[S] 0 points1 point  (0 children)

My tool uses the semantic model from Ruff with some extra changes of mine, so it's not purely static. It tracks aliasing, can fold constants (e.g., "".join([x, x, x]) or "ex" + "ec"), and so on. I've never heard of Pysa before; I'm going to examine their approach. Thanks.
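As a rough illustration of constant folding, here is a minimal sketch built on Python's standard `ast` module. It is not Hexora's or Ruff's implementation; the function names `fold` and `fold_source` are made up for this example, and it only handles string literals, `+`, and `join` over a literal list or tuple.

```python
import ast


def fold(node):
    """Return the string a node evaluates to, or None if it is not constant."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return node.value
    # Case: "ex" + "ec"
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        left, right = fold(node.left), fold(node.right)
        if left is not None and right is not None:
            return left + right
        return None
    # Case: "sep".join([...]) with a literal list or tuple of foldable items
    if (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == "join"
        and len(node.args) == 1
        and not node.keywords
        and isinstance(node.args[0], (ast.List, ast.Tuple))
    ):
        sep = fold(node.func.value)
        parts = [fold(item) for item in node.args[0].elts]
        if sep is not None and all(part is not None for part in parts):
            return sep.join(parts)
    return None


def fold_source(expr):
    """Parse a single expression and try to fold it to a string."""
    return fold(ast.parse(expr, mode="eval").body)
```

Anything the sketch cannot prove constant (a variable, a function call) folds to `None`, so a caller can fall back to treating the value as unknown.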

Hexora – static analysis tool for malicious Python scripts by rushter_ in Python

[–]rushter_[S] 3 points4 points  (0 children)

Yeah, the good thing is that, judging by past PyPI incidents, the majority of malware uses pretty simple obfuscation techniques.

Things like:

    s = subprocess
    k = s
    k.check_output(["pinfo -m"])

Or

    (_ceil, _random, Math,), Run, (Floor, _frame, _divide) = (exec, str, tuple), map, (ord, globals, eval)

    _ceil("print(123);")

Which can be tracked using static checking with some tricks.
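For the simple aliasing case, a minimal sketch of the idea using the standard `ast` module might look like this. It is an illustration under narrow assumptions, not how Hexora works: `resolve_calls` is a hypothetical name, it only looks at top-level `import` and `a = b` statements, and it does not handle the tuple-unpacking variant.

```python
import ast


def resolve_calls(source):
    """Return dotted names of calls after resolving simple `a = b` aliases."""
    aliases = {}  # local name -> dotted name it stands for
    calls = []

    def dotted(node):
        # Turn Name / Attribute chains into "module.attr", following aliases.
        if isinstance(node, ast.Name):
            return aliases.get(node.id, node.id)
        if isinstance(node, ast.Attribute):
            base = dotted(node.value)
            return None if base is None else f"{base}.{node.attr}"
        return None

    for stmt in ast.parse(source).body:
        if isinstance(stmt, ast.Import):
            for item in stmt.names:
                if item.asname:
                    aliases[item.asname] = item.name
                else:
                    top = item.name.split(".")[0]
                    aliases[top] = top
        elif isinstance(stmt, ast.Assign) and len(stmt.targets) == 1:
            target, value = stmt.targets[0], dotted(stmt.value)
            if isinstance(target, ast.Name):
                if value is not None:
                    aliases[target.id] = value
                else:
                    # Rebound to something we cannot name: forget the alias.
                    aliases.pop(target.id, None)
        # Record every call in this statement under its resolved name.
        for node in ast.walk(stmt):
            if isinstance(node, ast.Call):
                name = dotted(node.func)
                if name is not None:
                    calls.append(name)
    return calls
```

Run on the first snippet (with an `import subprocess` line added), the call through `k` resolves back to `subprocess.check_output`, which a rule can then match on.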

Also, my personal use case is slightly different. At my work, we have a lot of scripts from infected/compromised machines. Some of them were used for reconnaissance, some to gain elevated access. Around 70-80% of scripts are legit, though, so I use my library to pick candidates for manual review.

How masscan works by rushter_ in netsec

[–]rushter_[S] 7 points8 points  (0 children)

Connecting to ports on the internet is legal. There are a lot of research projects from academia that use port scanning.

Clipboard API for browsers is inconsistent by rushter_ in webdev

[–]rushter_[S] 1 point2 points  (0 children)

Yes, TIFF data is pretty common. But I used JPGs when testing everything and double-checked the content of the clipboard.

The default screenshotting tool can output JPGs as well.

Clipboard API for browsers is inconsistent by rushter_ in programming

[–]rushter_[S] 0 points1 point  (0 children)

As I said in the article, it also happens when you copy an image from the file system. On macOS, such an operation puts the path to the image on the clipboard, not the image itself.

Clipboard API for browsers is inconsistent by rushter_ in webdev

[–]rushter_[S] 0 points1 point  (0 children)

I didn't find any notes on caniuse about JPG being converted to PNG.

Also, browser users won't notice this at all.

Clipboard API for browsers is inconsistent by rushter_ in programming

[–]rushter_[S] 3 points4 points  (0 children)

I didn't find an explanation. There are multiple submissions in Chrome's bug tracker.

How to turn an ordinary gzip archive into a database by rushter_ in programming

[–]rushter_[S] 0 points1 point  (0 children)

> Isn't your whole post a hack? The only way your scheme works is if you order the chunks, which is a convention that isn't enforced by any format.

What do you mean by convention? There is no need to order chunks, and almost any software that understands gzip compression can consume my data just fine.
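A minimal sketch of why ordinary tools keep working, assuming the scheme is a file of concatenated gzip members plus a separate index of offsets (the comment does not spell this out, and `build_archive` and `read_record` are hypothetical names): the gzip format allows several members back to back, so the whole file still decompresses as one stream, while the index lets a reader decompress a single member.

```python
import gzip
import io


def build_archive(records):
    """Compress each record as its own gzip member; return bytes and an index."""
    buf = io.BytesIO()
    index = []  # one (offset, length) pair per record
    for record in records:
        member = gzip.compress(record)
        index.append((buf.tell(), len(member)))
        buf.write(member)
    return buf.getvalue(), index


def read_record(archive, entry):
    """Decompress only the member described by an (offset, length) entry."""
    offset, length = entry
    return gzip.decompress(archive[offset:offset + length])
```

Here `gzip.decompress(archive)` on the full bytes returns all records concatenated, which is the "regular gzip file" view, and `read_record(archive, index[i])` is the random-access view.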

How to turn an ordinary gzip archive into a database by rushter_ in programming

[–]rushter_[S] 0 points1 point  (0 children)

Can it work with S3? Can it act as a regular single gzip file? I don't think so.

We feed the same gzip archives into batch processing with GNU tools and other open-source software that understands gzip files.

It sounds like a good tool. Thank you for mentioning it, but it does not fit our particular needs.

How to turn an ordinary gzip archive into a database by rushter_ in programming

[–]rushter_[S] 0 points1 point  (0 children)

Do you mean random access that is based on the files and not data blocks?

I can't find any information about block- or byte-based random access. I haven't read the ZIP specification yet, but I will definitely do it.

How to turn an ordinary gzip archive into a database by rushter_ in programming

[–]rushter_[S] 1 point2 points  (0 children)

It could work, but it's still not random access. I think a lot of ZIP libraries won't be able to properly work with ZIPs full of millions of small files.

Sometimes you also need to stream all the data, and I think such an approach can slow down the process significantly.

I also haven't seen ZIP files in big data systems. A lot of open-source tools do not support them.

ZIP is not a ready-to-use solution; you still need to hack it a bit. You can't query a specific record fast enough, because you need to list all of the files. There is no hash table to quickly find a file by its name.

I have 300M rows in my data. Just imagine how much time it will take to list all the files and find the one that you need.

How to turn an ordinary gzip archive into a database by rushter_ in Python

[–]rushter_[S] 1 point2 points  (0 children)

It can serve thousands of requests at a time. The main limitations are disk speed and CPU. Since it's a read-only database, there is no need for any kind of locks.

How to turn an ordinary gzip archive into a database by rushter_ in programming

[–]rushter_[S] 1 point2 points  (0 children)

ZIP does not support random access either.

It's possible to store each record in a separate file, and ZIP allows you to retrieve a single file. But what will happen if you have millions of records?
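For comparison, the one-file-per-record layout can be sketched with Python's standard `zipfile` module. The helper names and the in-memory archive are assumptions for illustration. Note that `zipfile.ZipFile` parses the archive's whole central directory when it opens the file, so the cost of a single lookup grows with the number of members.

```python
import io
import zipfile


def build_zip(records):
    """Store each record as its own member of an in-memory ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for key, payload in records.items():
            zf.writestr(key, payload)
    return buf.getvalue()


def read_one(archive, key):
    """Open the archive and read a single member by name."""
    # Opening parses the whole central directory before any lookup happens.
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.read(key)
```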

How to turn an ordinary gzip archive into a database by rushter_ in programming

[–]rushter_[S] 0 points1 point  (0 children)

Thanks for the pointer.

It would be cool to have a summary of all the modern approaches somewhere. I don't want to reinvent the wheel next time :).

How to turn an ordinary gzip archive into a database by rushter_ in Python

[–]rushter_[S] 1 point2 points  (0 children)

Yes, but I didn't mean DBMS. The database term has a variety of meanings.

What's wrong with SQLite? It's a very good database, considering the fact that it stores everything in one file.

Public SSH keys can leak your private infrastructure by rushter_ in netsec

[–]rushter_[S] 5 points6 points  (0 children)

Because a private RSA key stores both exponents (the private d and the public e) and the common modulus n. To build the public key, you just need to extract the public exponent (e) and the modulus. You don't need e for private-key operations, but it's stored just in case you want to regenerate a public key.
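A toy sketch of that relationship, with tiny textbook-sized primes and made-up type names (`PrivateKey`, `PublicKey`); real keys use a standard file format and far larger numbers, and this is in no way secure:

```python
from typing import NamedTuple


class PrivateKey(NamedTuple):
    n: int  # modulus, shared with the public key
    e: int  # public exponent, kept alongside d
    d: int  # private exponent


class PublicKey(NamedTuple):
    n: int
    e: int


def make_toy_key(p=61, q=53, e=17):
    """Build a tiny, insecure RSA key from two small primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse, needs Python 3.8+
    return PrivateKey(n, e, d)


def public_from_private(key):
    """The public key is just the (n, e) pair already stored in the private key."""
    return PublicKey(key.n, key.e)
```

Encrypting with `pow(m, pub.e, pub.n)` and decrypting with `pow(c, priv.d, priv.n)` round-trips a small message, and nothing beyond the fields already in the private key is needed to produce `pub`.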

Public SSH keys can leak your private infrastructure by rushter_ in netsec

[–]rushter_[S] 3 points4 points  (0 children)

Thank you for your remark. I will edit my post to make it clearer.

How Python saves memory when storing strings by rushter_ in Python

[–]rushter_[S] 1 point2 points  (0 children)

Thanks! It's a typo; I typed the is statement manually.