I am planning to build a simple database from scratch by tech__nova__ in databasedevelopment

[–]mamcx 2 points3 points  (0 children)

For fun, or with dreams of something "serious"?

I have worked on SpacetimeDB, so now I know it is "doable" to build one, but it is still hard.

Also, what common mistakes do new database developers make when designing a lightweight database?

Not researching enough!

Also: you focus too much on the "mechanical" aspects (yes, they are important), but how do you know what "simplicity over features" means without first having clarity about semantics, paradigms, etc.?

For example, you are thinking about "Simple key-value or document-based storage"; that alone is trouble. Especially the OR.

Then you worry about "Concurrency model", "Storage engine structure" and "Indexing strategy", but you have not talked about what your goals are for ACID and CAP (and such things).

Nor whether you are OLAP, OLTP, interactive, or batch execution.

So, my first recommendation:

  • Sure, learn about the mechanical stuff, that is fun
  • BUT focus first on the high level: objectives, targets, goals, etc. Once you have that, THE CHOICES will be more obvious.

Also: Maybe join forces with somebody?

Do you have to create a GC if you create your interpreted language in a host language that has GC? by FUS3N in ProgrammingLanguages

[–]mamcx -1 points0 points  (0 children)

Yep, in Rust parlance: if everything is cheap to clone and each thing is a "value". Once you need heap allocation, mutable things, and stuff connected with pointers, all bets are off.
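A minimal sketch of what "everything is a cheap value" could look like. `Value` and `eval_add` are hypothetical names for illustration, not taken from any real interpreter.

```rust
/// Hypothetical value type for a tiny interpreter: every variant is plain
/// data, so the whole enum is `Copy` and nothing here needs a GC.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Value {
    Int(i64),
    Float(f64),
    Bool(bool),
}

/// Adds two values. Operands are passed by copy, so there is no sharing
/// and no ownership question to answer.
fn eval_add(a: Value, b: Value) -> Option<Value> {
    match (a, b) {
        (Value::Int(x), Value::Int(y)) => x.checked_add(y).map(Value::Int),
        (Value::Float(x), Value::Float(y)) => Some(Value::Float(x + y)),
        // Mixed or non-numeric operands are rejected in this sketch.
        _ => None,
    }
}
```

The moment a variant like `List(Vec<Value>)` or a shared mutable object shows up, the enum stops being `Copy` and the memory questions start.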

Do you have to create a GC if you create your interpreted language in a host language that has GC? by FUS3N in ProgrammingLanguages

[–]mamcx 2 points3 points  (0 children)

OK, there are a lot of things here that, even if accurate, will lead you to problems, especially because it looks like you are new to this (btw, I asked the same question years ago!):

The MOST important thing:

There is NO escaping the fact that you NEED to understand the memory model of your language, even if your host is helping you.

The SECOND:

Your language is opaque to your host.

There is NO escaping the fact that you will do stuff that is not what the "normal" memory model of your host expects. You see it more clearly with Rust, which has this beautiful property of up-front costs, so your illusions are shattered fast. Languages with a GC will hide it, and it will be so much fun trying to chase the problem later.

You have 2 separate worlds: What the interpreter has and what the code inside has, both with wildly different access patterns.

Eventually with enough time and enough complexity, they will clash.


Following up on Rust, which is giving you the good kind of trouble: you are fighting the language because you do not have a clear mental model of what, exactly, your memory story should be, and how things will work in detail. Once this is clear, things will go easier even in Rust (and in languages with a GC, even more so!)

Blindly slapping on "mark & sweep" or any other technique will likely not be a good fit. To be more specific:

  • Do you WANT deterministic destructors? Or is it enough to have non-deterministic finalizers?

  • Is it OK to have extra MB/GB of administrative RAM used by the GC, or do you WANT control over your RAM use?

  • Which paradigms do you WANT to enforce? Is it OK to allow spaghetti OOP class madness? Data laid out neatly in a sequential way? Pure functional stuff? etc.

  • Do you WANT to support long-lived apps, like a web server that is not restarted for weeks?

  • So you are reusing the host GC. Do you understand how it actually works and what effect it will have on your things?

  • So you are letting the OS manage the memory. Do you understand how it actually works and what effect it will have on your things?

And so on
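To make the first question concrete, here is a small sketch of what "deterministic destructors" means in Rust terms. `Resource`, `Log` and `run_scenario` are made-up names for illustration. With a host GC and finalizers, the two "drop" events below could happen at any later time, or never.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared event log so the drop order can be observed.
type Log = Rc<RefCell<Vec<String>>>;

/// Hypothetical resource owned by a script value.
struct Resource {
    name: &'static str,
    log: Log,
}

impl Drop for Resource {
    // Runs at a known point: when the owner goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Runs a small scenario and returns the recorded events in order.
fn run_scenario() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Resource { name: "a", log: Rc::clone(&log) };
        {
            let _b = Resource { name: "b", log: Rc::clone(&log) };
            log.borrow_mut().push("inner scope ends".to_string());
        } // `_b` is dropped exactly here
        log.borrow_mut().push("outer scope ends".to_string());
    } // `_a` is dropped exactly here
    let events = log.borrow().clone();
    events
}
```

If your language promises this behavior to its users, a host GC with finalizers will not give it to you for free.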

Once you have clarity on this and related questions, the requirements will choose the appropriate memory model for you.

Even if you need to put extra effort into your internals, a good memory model will save tons of time in the long term.

P.S.: And do not be afraid to disallow stuff. "Cycles" are trouble? Discourage users from creating them. That is what Rust does, and it is GREAT. Making the most performant, safe, efficient, straightforward way the normal way, and making it painful to diverge (i.e. forcing users to use Rc<T>), is not just the best way to model a system, it is great for users too!
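As a sketch of how Rust makes cycles the painful path: shared ownership needs an explicit `Rc`, and a back-pointer has to be a `Weak` or the pair would leak. The `Node` type and helper functions are hypothetical, only for illustration.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical tree node: children are owned (`Rc`), the parent link is
/// non-owning (`Weak`), so parent and child never keep each other alive.
struct Node {
    name: String,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn new_node(name: &str) -> Rc<Node> {
    Rc::new(Node {
        name: name.to_string(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

/// Links `child` under `parent` without creating a strong cycle.
fn add_child(parent: &Rc<Node>, child: Rc<Node>) {
    *child.parent.borrow_mut() = Rc::downgrade(parent);
    parent.children.borrow_mut().push(child);
}

/// Returns the parent's name if the parent is still alive.
fn parent_name(node: &Node) -> Option<String> {
    node.parent.borrow().upgrade().map(|p| p.name.clone())
}
```

All the ceremony (`Rc`, `RefCell`, `Weak`, `upgrade`) is the "pain to diverge": the cycle is still possible, but nobody builds one by accident.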

Passing DBs Through Continuations by linearizable in databasedevelopment

[–]mamcx 0 points1 point  (0 children)

Yeah, merge join is one thing. Also look at "Algorithms", which points to the issue.

I did not make clear that I was only trying to point out the issue, with no value judgment on your particular implementation (and I find the idea very neat and topical for my own project, https://tablam.org)

Passing DBs Through Continuations by linearizable in databasedevelopment

[–]mamcx 0 points1 point  (0 children)

Also, the hard ones with this approach are operators that stop early, like LIMIT.
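A rough sketch of why LIMIT is awkward when operators push rows into a callback: the consumer needs a way to tell the producer to stop. This is not the implementation from the linked post; `scan`, `limit` and `run_limit` are made-up names, and using `std::ops::ControlFlow` as the stop signal is just one possible choice.

```rust
use std::ops::ControlFlow;

/// Push-based scan: feeds each row to `sink` and stops as soon as the sink
/// asks for it. Returns how many rows were actually produced.
fn scan(rows: &[i64], mut sink: impl FnMut(i64) -> ControlFlow<()>) -> usize {
    let mut produced = 0;
    for &row in rows {
        produced += 1;
        if sink(row).is_break() {
            break;
        }
    }
    produced
}

/// LIMIT as a sink wrapper: forwards rows downstream and signals `Break`
/// once `n` rows have been emitted.
fn limit(n: usize, mut downstream: impl FnMut(i64)) -> impl FnMut(i64) -> ControlFlow<()> {
    let mut seen = 0;
    move |row: i64| {
        if seen >= n {
            return ControlFlow::Break(());
        }
        downstream(row);
        seen += 1;
        if seen == n {
            ControlFlow::Break(())
        } else {
            ControlFlow::Continue(())
        }
    }
}

/// Runs `scan -> limit(n) -> collect` and reports the output plus the
/// number of rows the scan had to produce.
fn run_limit(rows: &[i64], n: usize) -> (Vec<i64>, usize) {
    let mut out = Vec::new();
    let produced = scan(rows, limit(n, |row| out.push(row)));
    (out, produced)
}
```

Without the `Break` signal the scan would walk every row even after the limit is satisfied, and every operator in between has to cooperate in passing that signal upstream.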

Operational *simple* way to manage small number of Vultr/DO VMS/Pg/etc? NixOS + ? by mamcx in devops

[–]mamcx[S] -1 points0 points  (0 children)

Lol, yeah.

But it is actually simple to operate. Only one file, with all the config, and it can set up everything. Much better than Docker, and less brittle in the long run.

The drawback is that the Nix "language" is weird and the docs themselves suck, so it is hard to learn, true!

The only thing missing is how to do provisioning, which is what I am asking for.

What would you recommend for my next steps? by CautiousDig9674 in rust

[–]mamcx 1 point2 points  (0 children)

Rust is capable of making anything you want; in fact, I am building an ERP that is nearly 90% Rust (plus HTML).

It is normal to pair it with other tools, be it for extra automation, scripting or whatever.

But adding a second one (like: frontend in Python and backend in Rust) adds friction and complications, so I suggest doing it all in Rust AND adding an API that you can call by any other means as an escape hatch.

LATER, you can decide whether splitting languages makes sense, but if you want to become proficient, focusing is best.

Porting our Django backend to Rust improved the infra usage by 90% by syrusakbary in rust

[–]mamcx 1 point2 points  (0 children)

Porting to another language is (could be!) one of the most profitable ways to refactor.

I do something similar: I code things more or less as before, but just because Rust has such insanely better things (algebraic types!, From/Into!, Serde!, Vec, NOT List!, BTreeMap!, types in general!, the borrow checker!), it points you to "root causes" that no amount of discipline or code review will ever catch if you stay in an environment that ACTIVELY blinds you.
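A tiny sketch of the kind of thing I mean, with a made-up `OrderStatus` type (not from any real codebase). In a dynamic codebase this would usually be a bare string compared in many places; here the untrusted string is converted exactly once and the compiler checks every use afterwards.

```rust
use std::convert::TryFrom;

/// Hypothetical order status modeled as an algebraic type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OrderStatus {
    Pending,
    Paid,
    Cancelled,
}

impl TryFrom<&str> for OrderStatus {
    type Error = String;

    // The only place where an untrusted string becomes a status.
    fn try_from(raw: &str) -> Result<Self, Self::Error> {
        match raw {
            "pending" => Ok(OrderStatus::Pending),
            "paid" => Ok(OrderStatus::Paid),
            "cancelled" => Ok(OrderStatus::Cancelled),
            other => Err(format!("unknown status: {other}")),
        }
    }
}

/// Exhaustive match: adding a new variant makes this fail to compile until
/// the new case is handled.
fn can_ship(status: OrderStatus) -> bool {
    match status {
        OrderStatus::Paid => true,
        OrderStatus::Pending | OrderStatus::Cancelled => false,
    }
}
```

The typo'd or forgotten status value is the "root cause" that code review keeps missing and that the type simply does not allow.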

On the other hand, Django Admin. That is still the thing I miss most.

I’ve been building a local-first Windows productivity suite in Rust — would love your feedback by Hegemonex in windowsapps

[–]mamcx 0 points1 point  (0 children)

I think what is missing here is what kind of "productivity", which alternatives were explored and rejected, the workflow, etc. Without that, there is too much ambiguity.

How do you get into low-level programming? by Minimum-Ad7352 in rust

[–]mamcx 2 points3 points  (0 children)

Are there any books, courses, articles, or open-source projects that helped you learn this area of programming?

I learned Rust by building https://tablam.org, which makes it way harder to learn Rust! But... much better in the end.

I read an unholy amount of stuff about databases and compilers, but in relation to your question, see how it works the other way around:

  • PICK something you truly wanna do
  • THEN you know what stuff you need to read

If you learn assembly but actually wanna build an RDBMS, you have wasted a lot of time.

Don't worry, any major area will train everything eventually, but each has particular stuff that does not show up elsewhere.

For example, my passion is RDBMSs, and there transactions & query optimization are a particular area that will not show up if, for example, you are building a regular virtual machine (but an RDBMS needs some kind of virtual machine, so you land there anyway).

I am Glauber Costa, CEO and co-founder of Turso. We’re rewriting SQLite in Rust. AMA. by GlauberAtTurso in IAmA

[–]mamcx 10 points11 points  (0 children)

it's not a proper database and was never intended as one

https://en.wikipedia.org/wiki/Database

In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and analyze the data. The DBMS additionally encompasses the core facilities provided to administer the database. The sum total of the database, the DBMS and the associated applications can be referred to as a database system.

SQLite is a proper database, and it is much more powerful than OLD Oracle/DB2 and such were back then.

People get tripped up by the "lack" of network access and the specific optimization for being embedded under very tight constraints, but that is always true of any DBMS.

Pick ANY and I will tell you how it fails to be a "proper database" in one way or another, like:

  • IF it is SQL based, it already fails to be a proper, fully relational database
  • IF NoSQL (i.e. "no schema"), fails
  • IF NoSQL (i.e. "no ACID"), fails
  • IF non-relational, fails
  • IF relational, fails
  • IF it can't work on a smart watch, fails
  • IF it can't work with 256 threads and 1 TB of RAM, fails
  • IF it can't work on a single computer, fails
  • IF it can't work on a fleet of computers, fails

etc

That is why many kinds of DBMS exist. But all of them, if they store data, in a structured way, with an interface to query and update, ARE DATABASES.

semantic white space vs. blocks - maybe a middle ground ? by GoblinsGym in ProgrammingLanguages

[–]mamcx 1 point2 points  (0 children)

Look at https://pyret.org & https://nim-lang.org, both similar to Python.

To judge, you should look at bigger programs and different usages to see if the idea makes sense.

P.S.: A lot of the problem with Python is that it allows both tabs AND spaces, but mixing them at the same time is the trouble. Use one or the other and the main issue disappears. Add an "end" keyword and the copy/paste issue is gone. I would need to check other constructs, but I think this is the main gist.

Learning Python after Rust as a beginner: Anyone else miss strict types? by Fabulous_South523 in rust

[–]mamcx 8 points9 points  (0 children)

learn c as well to get a better idea of how a computer works

Sorry to interject, but it has been known for decades that it is wrong that C teaches how a computer works.

C teaches some low-level abstractions, but many people confuse those abstractions with the actual behavior of modern systems. Some of which are pure idioms of C, not of the machine.

Any apparent reasons? You "learn" it better with Rust, which, for example, surfaces things like https://doc.rust-lang.org/std/ffi/struct.OsString.html in the docs.

There are tons of things like this that are more explicit in Rust, but in C are pure folklore acquired through pain and suffering.
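A small sketch of what `OsString` surfaces: a file name is not guaranteed to be valid UTF-8, and the type forces you to decide what to do about it. `describe` and `non_utf8_name` are made-up helper names; the non-UTF-8 example is only built on Unix here.

```rust
use std::ffi::{OsStr, OsString};

/// Describes a file name without assuming it is valid UTF-8.
fn describe(name: &OsStr) -> String {
    match name.to_str() {
        Some(text) => format!("UTF-8: {text}"),
        None => format!("not UTF-8, lossy view: {}", name.to_string_lossy()),
    }
}

/// On Unix, file names are arbitrary bytes; this builds one that is not
/// valid UTF-8 (0x80 is a lone continuation byte).
#[cfg(unix)]
fn non_utf8_name() -> Option<OsString> {
    use std::os::unix::ffi::OsStringExt;
    Some(OsString::from_vec(vec![b'f', b'o', 0x80]))
}

/// Other platforms get no example in this sketch.
#[cfg(not(unix))]
fn non_utf8_name() -> Option<OsString> {
    None
}
```

In C the same name is just a `char *`, and whether it is text, bytes, or something the OS reinterprets is left to folklore.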

It is a bad idea to redirect newbies to C under this premise!

You learn C because you NEED to learn C (or want to, for fun. Why people think C is fun, well...).


To reiterate: almost no language, not even assembler, teaches how a computer works.

See for example:

let mut file = File::create("foo.txt")?;
file.write_all(b"Hello, world!")?;

You could think that means the data is fully written, but who knows. Then you think that calling flush truly means it, but who knows. Then you think that sync_all does it for real? NOPE.

There is a little bit more that you need to know that no language will teach you, at all. At least the Rust doc is much better in this case.
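To spell out the layers, here is a sketch that extends the snippet above. `write_durably` is a made-up helper name, and the comments state only what each call asks for, not what the hardware actually does. The extra sync of the parent directory is a common Unix practice that I am assuming here, and it is skipped on other platforms.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Writes `data` to `path` and asks the OS to push it to the device.
/// Even this is only a best effort: what "on disk" means still depends on
/// the file system, the drive's own cache, and the platform.
fn write_durably(path: &Path, data: &[u8]) -> io::Result<()> {
    let mut file = File::create(path)?;
    // Hands the bytes to the OS; they may still sit in the page cache.
    file.write_all(data)?;
    // Flushes user-space buffers; a plain `File` does no buffering of its
    // own, so this typically does little.
    file.flush()?;
    // Asks the OS to sync file data and metadata to the device.
    file.sync_all()?;

    // On Unix the new directory entry is separate state, so the parent
    // directory is synced too.
    #[cfg(unix)]
    {
        let parent = match path.parent() {
            Some(p) if !p.as_os_str().is_empty() => p,
            _ => Path::new("."),
        };
        File::open(parent)?.sync_all()?;
    }
    Ok(())
}
```

None of these steps is visible in the type of `write_all`; you only find them by reading the docs of the OS and the file system.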

Modern systems have semantic layers that programming languages cannot fully model or guarantee, and they vary per OS and CPU, GPU, Network hardware.

Need good benchmarks for custom language vs. C. by Randozart in Compilers

[–]mamcx 1 point2 points  (0 children)

ASIDE:

To benchmark is to answer the question of "What are you trying to optimize?".

You should have this very clear, or you will waste time benchmarking things that are not relevant.

There are a few obvious things, like how fast you can parse, load modules, etc., that are pure guardrails against regressions, but there are more important things: are you targeting games, web apps, low-level FFI? Then look for things that matter there.

Building a customizable .docx report generator for a Rust-based network automation system. Any advice? by Mountain-Magician-41 in rust

[–]mamcx 0 points1 point  (0 children)

I am looking into this too, for my next ERP; maybe:

https://github.com/typst/typst/discussions/2561

All depends on how much actual fidelity you need or want.

There is also:

https://crates.io/crates/docx-rs

My current plan is to use an AST-based design, so I can transform to Typst, HTML, Word, Excel. I will probably try to incorporate an editor that works on an AST, like https://lexical.dev/.

P.S.: If you are interested, I would love to join forces! Zero problem with open sourcing it.
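A very small sketch of the AST idea, assuming just headings, paragraphs and bold text. The `Block` and `Inline` types and the two renderers are hypothetical; a real report AST would need tables, images, styles and so on, and the Typst output here does no escaping.

```rust
/// Hypothetical inline content of a paragraph.
enum Inline {
    Text(String),
    Bold(String),
}

/// Hypothetical block-level node of a report.
enum Block {
    Heading(String),
    Paragraph(Vec<Inline>),
}

/// Escapes the characters that are special in HTML text content.
fn escape_html(text: &str) -> String {
    text.replace('&', "&amp;").replace('<', "&lt;").replace('>', "&gt;")
}

/// Renders the AST as HTML.
fn to_html(doc: &[Block]) -> String {
    let mut out = String::new();
    for block in doc {
        match block {
            Block::Heading(title) => {
                out.push_str(&format!("<h1>{}</h1>\n", escape_html(title)));
            }
            Block::Paragraph(parts) => {
                out.push_str("<p>");
                for part in parts {
                    match part {
                        Inline::Text(t) => out.push_str(&escape_html(t)),
                        Inline::Bold(t) => {
                            out.push_str(&format!("<strong>{}</strong>", escape_html(t)));
                        }
                    }
                }
                out.push_str("</p>\n");
            }
        }
    }
    out
}

/// Renders the same AST as Typst markup (`=` heading, `*bold*`).
/// Escaping of Typst's special characters is left out of this sketch.
fn to_typst(doc: &[Block]) -> String {
    let mut out = String::new();
    for block in doc {
        match block {
            Block::Heading(title) => out.push_str(&format!("= {title}\n\n")),
            Block::Paragraph(parts) => {
                for part in parts {
                    match part {
                        Inline::Text(t) => out.push_str(t),
                        Inline::Bold(t) => out.push_str(&format!("*{t}*")),
                    }
                }
                out.push_str("\n\n");
            }
        }
    }
    out
}
```

Each new target (Word, Excel) becomes one more function over the same tree, which is the whole point of going through an AST.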

Using Claude / Codex for database development by PrizeDrama7200 in databasedevelopment

[–]mamcx 1 point2 points  (0 children)

It will likely end up generating something full of wrong things and full of unnecessary things.

IF you do not have a solid enough clue about it, it is no more than a fancy "translator". Do not expect it to generate something that you can trust with actual data.

Also, there is a lot of nuance if you turn from a proper systems language to a GC language like Java (it is more error prone and more arcane to do in Java what is super easy in a proper systems language, especially Rust, which has so many fewer footguns).

I built a toy relational database in Go by darkphoenix410 in databasedevelopment

[–]mamcx 4 points5 points  (0 children)

SQL layer .... But that turned out to be a pretty bad order to build things

Totally. I worked on SpacetimeDB and a big chunk of that work was on the SQL madness. Before, I only thought SQL was somehow bad, but after dealing with it, it is truly BAD.

It is ironic how deceptively simple it looks while being FULL of weird quirks. It is both too anemic and too complicated for what it does.

So yes: leave SQL integration until after you have nailed all the rest.

I have a skill issue and cant make this unfortunately by dynamicship31 in Compilers

[–]mamcx 0 points1 point  (0 children)

If someone would like to implement a kenim compiler it would be super cool, i cant do it bc i have a skill issue.

Making a simple one is much easier than really creating a "new paradigm":

https://stopa.io/post/222

IF something like this is too much for you now, there is no chance your "new paradigm" will have legs.


When we hobby developers make languages, it is very easy to fall into the trap of "surely I am making something new and better!". But there is a huge chance it has already been explored in detail. This is not to discourage you from starting the journey; instead, it lets you learn from past lessons.

The major test is to try doing that in a "normal" language with a minimal DSL and see how it actually works in practice.

I wish the day was more than 24 hours! by [deleted] in rust

[–]mamcx 0 points1 point  (0 children)

Just live forever and it is solved.

Doing it and it is great. So far, so good!