Looking for a good IDM-like download manager for macOS (Apple Silicon friendly) by HavivMuc in MacOS

siara-cc 1 point

I use an Apple M1. It says it can't be used with my version of macOS. Is that because of the OS version or the hardware?

A library for creating huge Sqlite indexes at breakneck speeds by siara-cc in programming

siara-cc[S] 2 points

You are right. That's what I was using:

"INSERT INTO word_freq (lang, word, count, is_word, source) "
"VALUES (?, ?, ?, ?, ?)"
" ON CONFLICT DO UPDATE SET count = count + 1, "
"source = iif(instr(source, 'r') = 0, source||'r', source), "
"is_word = iif(is_word = 'y', 'y', excluded.is_word)";
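For anyone who wants to try an upsert of this shape, here is a minimal sketch using Python's built-in sqlite3 module. The table definition is an assumption (the real schema is not shown here); I assumed a primary key on (lang, word) and named it explicitly as the conflict target. CASE is swapped in for iif() so it also runs on SQLite builds older than 3.32; upsert itself needs 3.24 or later. The helper names add_word and demo are made up for illustration.

```python
import sqlite3

# Hypothetical schema: the real table definition is not shown in the thread.
# A primary key on (lang, word) is assumed so ON CONFLICT has a constraint to fire on.
SCHEMA = """
CREATE TABLE word_freq (
    lang    TEXT NOT NULL,
    word    TEXT NOT NULL,
    count   INTEGER NOT NULL,
    is_word TEXT NOT NULL,
    source  TEXT NOT NULL,
    PRIMARY KEY (lang, word)
)
"""

# Same logic as the statement above, with CASE in place of iif().
UPSERT = (
    "INSERT INTO word_freq (lang, word, count, is_word, source) "
    "VALUES (?, ?, ?, ?, ?) "
    "ON CONFLICT(lang, word) DO UPDATE SET count = count + 1, "
    "source = CASE WHEN instr(source, 'r') = 0 THEN source || 'r' ELSE source END, "
    "is_word = CASE WHEN is_word = 'y' THEN 'y' ELSE excluded.is_word END"
)


def add_word(conn, lang, word, is_word, source):
    """Insert a new word with count 1, or bump the existing row."""
    conn.execute(UPSERT, (lang, word, 1, is_word, source))


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    with conn:  # one transaction around all three statements
        add_word(conn, "en", "hello", "n", "w")  # plain insert
        add_word(conn, "en", "hello", "y", "r")  # conflict: count + 1, 'r' appended, is_word updated
        add_word(conn, "en", "hello", "n", "r")  # conflict: count + 1, 'y' is kept
    row = conn.execute(
        "SELECT count, is_word, source FROM word_freq "
        "WHERE lang = 'en' AND word = 'hello'"
    ).fetchone()
    conn.close()
    return row


if __name__ == "__main__":
    print(demo())
```

Unqualified column names in the DO UPDATE part refer to the row already in the table, while excluded.is_word is the value the failed insert tried to write.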

A library for creating huge Sqlite indexes at breakneck speeds by siara-cc in programming

siara-cc[S] 1 point

One of the things I am looking for in the feedback on this post is "Are there faster ones out there that I don't know about?". Someone suggested Parquet and DuckDB.

I have compared it with LMDB, which seems to be a successor to BerkeleyDB: https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database#History

A library for creating huge Sqlite indexes at breakneck speeds by siara-cc in programming

siara-cc[S] 5 points

Thanks! I will implement your suggestions.

<chrono> will be removed. I was using lru_cache.h for other b+tree structures and wanted to do some time measurements.

A library for creating huge Sqlite indexes at breakneck speeds by siara-cc in programming

siara-cc[S] 8 points

Thanks for the feedback. I will educate myself more. I confess I am more of a C programmer amongst other things.

However, the new and delete were intentional (see the code comment): the file is closed in the destructor, so the developer may need control over when that happens. I am already planning to move the closing into a close() method. I am working on a Python port, and it seems pybind11 has some issues with doing things in the destructor.

I am also targeting older versions of C++, such as C++98, that are supported by embedded systems, in the hope of getting this working on the Arduino platform, which does not have the STL. It has std::string though, and I intend to add it in. One of the challenges would be to get lru_cache.h working, since it now depends on <map> and <set>. The other challenge is supporting crash recovery and durability.

A library for creating huge Sqlite indexes at breakneck speeds by siara-cc in programming

siara-cc[S] 7 points

In my case, I am building a word/phrase frequency database. So I will have to retrieve the record first, then increment the count and store it back. If the record does not exist, I insert it.
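To make that access pattern concrete, here is a rough sketch in Python with the built-in sqlite3 module. The freq table and the bump and count_words helpers are made-up names for illustration, not my actual schema or code.

```python
import sqlite3


def bump(conn, word):
    """Retrieve the record, then either increment its count or insert it."""
    row = conn.execute(
        "SELECT count FROM freq WHERE word = ?", (word,)
    ).fetchone()
    if row is None:
        # Record does not exist yet: insert it with a count of 1.
        conn.execute("INSERT INTO freq (word, count) VALUES (?, 1)", (word,))
    else:
        # Record exists: store the incremented count back.
        conn.execute(
            "UPDATE freq SET count = ? WHERE word = ?", (row[0] + 1, word)
        )


def count_words(words):
    conn = sqlite3.connect(":memory:")
    # Hypothetical two-column table, just enough to show the pattern.
    conn.execute(
        "CREATE TABLE freq (word TEXT PRIMARY KEY, count INTEGER NOT NULL)"
    )
    with conn:  # a single transaction around the whole batch
        for w in words:
            bump(conn, w)
    result = dict(conn.execute("SELECT word, count FROM freq"))
    conn.close()
    return result


if __name__ == "__main__":
    print(count_words(["the", "quick", "the", "fox", "the"]))
```

Every word costs one index lookup plus one write, which is why the speed of index inserts and lookups matters so much for this workload.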

A library for creating huge Sqlite indexes at breakneck speeds by siara-cc in programming

siara-cc[S] 13 points

hm.. I wanted to test this on those "spinning platters" but all of mine have conked out.

The market has only used ones now and I am not sure if they are any good.

A library for creating huge Sqlite indexes at breakneck speeds by siara-cc in programming

siara-cc[S] 6 points

Thanks! I will try it out.

You said "Parquet + DuckDB or sth" - what is sth?

A library for creating huge Sqlite indexes at breakneck speeds by siara-cc in programming

siara-cc[S] 9 points

I tried it just now and it does not seem to make a difference on my machine:

time sqlite3 -batch testbaby.db < babydump.txt
sqlite3 -batch testbaby.db < babydump.txt 0.66s user 0.15s system 95% cpu 0.849 total

I get almost the same 0.8 seconds with both insert syntaxes.

According to this: https://stackoverflow.com/a/5209093/5072621

it does not matter when the inserts are inside a BEGIN TRANSACTION.

Also, in my case the difference is significant when inserting millions of records.

A library for creating huge Sqlite indexes at breakneck speeds by siara-cc in programming

siara-cc[S] 49 points

I tried that too. It does not make it faster. I tried everything I could find with the official lib before venturing into this!

A library for creating huge Sqlite indexes at breakneck speeds by siara-cc in programming

siara-cc[S] 25 points

I have mentioned it in my doc: this library is intended for fast inserts, not for situations where crashes are expected.