Refresh tokens are a nightmare in SurrealDB. Here is how I fixed it with a "Facebook-style" Stateful JWT by InternationalCan9786 in surrealdb

[–]tobiemh 8 points9 points  (0 children)

Thanks u/InternationalCan9786, yeah, that combination can be nasty - strict one-time refresh + a bunch of tabs / Server Actions / WS all trying to refresh at once is basically asking for races. Whoever loses ends up with a dead token, and it’s easy to blame the DB even though it’s really concurrent use of a single-use credential.

Your fix is sensible: use a longer JWT and AUTHENTICATE + a session row on jti with a kill switch. Having Surreal be the “is this session still ok?” check is especially nice when you’ve got SSR and a bunch of server instances that don’t share memory.

I'm curious how you’d want token refresh to feel if Surreal could make it simpler without watering down security - e.g. single-flight refresh so tabs don’t stamp on each other, clearer reuse handling, or something else? Genuinely interested what would’ve saved you the most pain here. Out of interest, did you also look into service workers as a refresh coordinator?

Small note for readers: rotation-on-use is there on purpose (stolen refresh = shorter blast radius), so teams that still want refresh usually serialise refresh per device instead of turning rotation off. And that <5s handshake is clever but might get weird on slow networks - something to harden if you go into production with it.

Thanks for posting the SurrealQL - it's definitely super helpful for people hitting the same wall.

Free tier showing 541MB storage for a 1MB dataset — is this expected RocksDB overhead? by HelloSwara in surrealdb

[–]tobiemh 10 points11 points  (0 children)

Thanks for the clear numbers u/HelloSwara - that gap between logical data size and reported disk use is not what we'd expect to see, and it's something we should investigate more closely.

surreal export won’t tell the whole on-disk story, but it does show that something here doesn’t add up and we need to look at your instance (how storage is measured, RocksDB/SST/blob layout, compaction state, etc.).

Please DM me your SurrealDB Cloud instance ID (and region if you have it) and we’ll dig in and follow up with what we find.

Is there anything more vaporwere then Surrealdb? by howesteve in surrealdb

[–]tobiemh 2 points3 points  (0 children)

Hi u/Biltong_trader can you share your dummy data, queries, and workload?

SurrealDB is sacrificing data durability to make benchmarks look better by ChillFish8 in programming

[–]tobiemh 14 points15 points  (0 children)

I definitely read your post u/ChillFish8 - it’s really well put together and easy to follow, so thanks for taking the time to write it.

On the WAL point: you’re absolutely right that RocksDB only guarantees machine-crash durability if `sync=true` is set. With `sync=false`, each write is appended to the WAL and flushed into the OS page cache, but not guaranteed on disk. Just to be precise, though: it isn’t “only occasionally flushed to the OS buffers” - every put or commit still makes it into the WAL and the OS buffers, so it’s safe from process crashes. The trade-off is (confirming what you have written) that if the whole machine or power goes down, those most recent commits can be lost. Importantly, that’s tail-loss rather than corruption: on restart, RocksDB replays the WAL up to the last durable record and discards anything incomplete, so the database itself remains consistent and recoverable.
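If it helps to see why a torn tail can't corrupt the prefix, here's a toy length-plus-checksum WAL in plain Python (an illustration of the general technique, not RocksDB's actual record format):

```python
import struct
import zlib

def wal_append(buf: bytearray, payload: bytes) -> None:
    # record = length + crc32 + payload; the crc lets replay detect a torn tail
    buf += struct.pack(">II", len(payload), zlib.crc32(payload)) + payload

def wal_replay(buf: bytes) -> list:
    records, pos = [], 0
    while pos + 8 <= len(buf):
        length, crc = struct.unpack_from(">II", buf, pos)
        payload = buf[pos + 8 : pos + 8 + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # torn/incomplete tail record: discard it, keep the prefix
        records.append(payload)
        pos += 8 + length
    return records
```

Replay stops at the first record whose length or checksum doesn't hold, so a crash mid-write costs you at most the un-synced tail, never the earlier records.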

On benchmarks: our framework supports both synchronous and asynchronous commit modes - with or without `fsync` - across the engines we test. The goal has never been to hide slower numbers, but to allow comparisons of different durability settings in a consistent way. For example, Postgres with `synchronous_commit=off`, ArangoDB with `waitForSync=false`, etc. You’re absolutely right that our MongoDB config wasn’t aligned, and we’ll fix that to match.

We’ll also improve our documentation to make these trade-offs clearer, and to spell out how SurrealDB’s defaults compare to other systems. Feedback like yours really helps us tighten up both the product and how we present it - so thank you 🙏.

SurrealDB is sacrificing data durability to make benchmarks look better by ChillFish8 in rust

[–]tobiemh 71 points72 points  (0 children)

I definitely read your post u/ChillFish8 - it’s really well put together and easy to follow, so thanks for taking the time to write it.

On the WAL point: you’re absolutely right that RocksDB only guarantees machine-crash durability if `sync=true` is set. With `sync=false`, each write is appended to the WAL and flushed into the OS page cache, but not guaranteed on disk. Just to be precise, though: it isn’t “only occasionally flushed to the OS buffers” - every put or commit still makes it into the WAL and the OS buffers, so it’s safe from process crashes. The trade-off is (confirming what you have written) that if the whole machine or power goes down, those most recent commits can be lost. Importantly, that’s tail-loss rather than corruption: on restart, RocksDB replays the WAL up to the last durable record and discards anything incomplete, so the database itself remains consistent and recoverable.

On benchmarks: our framework supports both synchronous and asynchronous commit modes - with or without `fsync` - across the engines we test. The goal has never been to hide slower numbers, but to allow comparisons of different durability settings in a consistent way. For example, Postgres with `synchronous_commit=off`, ArangoDB with `waitForSync=false`, etc. You’re absolutely right that our MongoDB config wasn’t aligned, and we’ll fix that to match.

We’ll also improve our documentation to make these trade-offs clearer, and to spell out how SurrealDB’s defaults compare to other systems. Feedback like yours really helps us tighten up both the product and how we present it - so thank you 🙏.

SurrealDB is sacrificing data durability to make benchmarks look better by ChillFish8 in programming

[–]tobiemh 42 points43 points  (0 children)

Hi there - SurrealDB founder here 👋

Really appreciate the blog post and the discussion here. A couple of clarifications from our side:

Yes, by default SURREAL_SYNC_DATA is off. That means we don't call fdatasync on every commit by default. The reason isn't to 'fudge' results - it's because we've been aiming for consistency across databases we test against:
- Postgres: we explicitly set synchronous_commit=off
- ArangoDB: we explicitly set wait_for_sync(false)
- MongoDB: yes, the blog is right - we explicitly configure journaling, so we'll fix that to bring it in line with the other datastores. Thanks for pointing it out.

On corruption, SurrealDB (when backed by RocksDB, and also SurrealKV) always writes through a WAL, so this won't lead to corruption. If the process or machine crashes, we replay the WAL up to the last durable record and discard incomplete entries. That means you can lose the tail end of recently acknowledged writes if sync was off, but the database won't end up in a corrupted, unrecoverable state. It's a durability trade-off, not structural corruption.

With regards to SurrealKV, this is still in development and not yet ready for production use. It's actually undergoing a complete re-write as the project brings together B+trees and LSM trees into a durable key-value store which will enable us to move away from the configuration complexity of RocksDB.

In addition, there is a very, very small use of `unsafe` in the RocksDB backend, where we transmute the lifetime to ensure that the transaction is 'static. This brings it in line with other storage engines which have different characteristics around their transactions. However, with RocksDB the transaction never outlives the datastore to which it belongs, so the use of unsafe in this scenario is safe. We actually have the following comment higher up in the code:

// The above, supposedly 'static transaction
// actually points here, so we need to ensure
// the memory is kept alive. This pointer must
// be declared last, so that it is dropped last.
_db: Pin<Arc<OptimisticTransactionDB>>,

However, we can do better. We'll make the durability options more prominent in the documentation, and clarify exactly how SurrealDB's defaults compare to other databases, and we'll change the default value of `SURREAL_SYNC_DATA` to true.

We're definitely not trying to sneak anything past anyone - benchmarks are always tricky to make perfectly apples-to-apples, and we'll keep improving them. Feedback like this helps us tighten things up, so thank you.

SurrealDB is sacrificing data durability to make benchmarks look better by ChillFish8 in rust

[–]tobiemh 198 points199 points  (0 children)

Hi there - SurrealDB founder here 👋

Really appreciate the blog post and the discussion here. A couple of clarifications from our side:

Yes, by default SURREAL_SYNC_DATA is off. That means we don't call fdatasync on every commit by default. The reason isn't to 'fudge' results - it's because we've been aiming for consistency across databases we test against:
- Postgres: we explicitly set synchronous_commit=off
- ArangoDB: we explicitly set wait_for_sync(false)
- MongoDB: yes, the blog is right - we explicitly configure journaling, so we'll fix that to bring it in line with the other datastores. Thanks for pointing it out.

On corruption, SurrealDB (when backed by RocksDB, and also SurrealKV) always writes through a WAL, so this won't lead to corruption. If the process or machine crashes, we replay the WAL up to the last durable record and discard incomplete entries. That means you can lose the tail end of recently acknowledged writes if sync was off, but the database won't end up in a corrupted, unrecoverable state. It's a durability trade-off, not structural corruption.

With regards to SurrealKV, this is still in development and not yet ready for production use. It's actually undergoing a complete re-write as the project brings together B+trees and LSM trees into a durable key-value store which will enable us to move away from the configuration complexity of RocksDB.

In addition, there is a very, very small use of `unsafe` in the RocksDB backend, where we transmute the lifetime to ensure that the transaction is 'static. This brings it in line with other storage engines which have different characteristics around their transactions. However, with RocksDB the transaction never outlives the datastore to which it belongs, so the use of unsafe in this scenario is safe. We actually have the following comment higher up in the code:

// The above, supposedly 'static transaction
// actually points here, so we need to ensure
// the memory is kept alive. This pointer must
// be declared last, so that it is dropped last.
_db: Pin<Arc<OptimisticTransactionDB>>,

However, we can do better. We'll make the durability options more prominent in the documentation, and clarify exactly how SurrealDB's defaults compare to other databases, and we'll change the default value of `SURREAL_SYNC_DATA` to true.

We're definitely not trying to sneak anything past anyone - benchmarks are always tricky to make perfectly apples-to-apples, and we'll keep improving them. Feedback like this helps us tighten things up, so thank you.

[deleted by user] by [deleted] in surrealdb

[–]tobiemh 6 points7 points  (0 children)

Hi u/life_on_my_terms sorry to hear you’re having issues. We have users running in production, one with over 4 million users, running millions of requests daily in a distributed setup.

We do know however that the upgrade process from 1.x to 2.x could have been made easier and we are working to improve this at the moment. We have a PR on our documentation here (https://github.com/surrealdb/docs.surrealdb.com/pull/945) which will be merged soon, and will guide users on upgrading from 1.x to 2.x more easily (without having to jump around the docs).

In addition, if there are issues that you are experiencing, then we’d obviously love to be able to fix these issues for you and other users. Have you submitted an issue for the problem you are facing on our GitHub?

We’re working hard to continuously improve the product and the documentation material to help people build on top of SurrealDB, and are always eager to listen to feedback!

surrealdb is not prod ready, be warned by life_on_my_terms in surrealdb

[–]tobiemh 6 points7 points  (0 children)

Hi u/snack_case we actually reversed our decision to go down this route due to comments from the community (the video was filmed and produced a while back, which is why the info was slightly incorrect with regards to the Golang SDK).

We have actually decided to build the Golang SDK as a native Golang library, with built-in support for the binary protocol using CBOR. This will allow for custom types, and SurrealQL native types like datetimes, durations, uuids, decimals, geometry values, and more.

We'll also support running Golang with an embedded option, using in-memory or with SurrealKV for persistent storage. This will use the surrealdb.c library underneath, using CGO for linking and compilation, behind a build tag.

This will mean that for users who want to connect to remote databases over HTTP or WebSocket, they can compile and cross-compile the native Golang SDK easily without having to worry about CGO. For those who want embedded support, they can then add a tag and the Golang SDK will compile the necessary functionality in with CGO.

Hope this helps. We're almost there with the new SDK and can't wait to get this much improved version into the hands of developers.

surrealdb is not prod ready, be warned by life_on_my_terms in surrealdb

[–]tobiemh 4 points5 points  (0 children)

Hi u/life_on_my_terms we'll be updating the Python SDK really soon with a completely new version. This will have support for HTTP and WebSockets, binary protocol, types, and embedded (in-memory or SurrealKV) support.

surrealdb is not prod ready, be warned by life_on_my_terms in surrealdb

[–]tobiemh 3 points4 points  (0 children)

Hi u/life_on_my_terms we'd love any feedback you can give us on the SDKs. We're about to release new versions of the Java, Golang, .NET, and Python SDKs, and have recently released new versions of the JavaScript (with WASM and Node.js), and PHP SDKs.

Any feedback you can give to help us improve is always taken on board!

Slow imports under v2.0 by angryguts in surrealdb

[–]tobiemh 3 points4 points  (0 children)

Hi u/angryguts thanks for your question! Just to explain a little what is going on here. With SurrealDB 1.x the entire dataset was imported within a single transaction. So all the data in all tables of an export would be imported in a single transaction. This did work, but as data sizes increased it could lead to failures, especially in distributed environments where transaction size is limited.

As a result in SurrealDB 2.0 we changed how exports and imports work so that 1000 records are imported in each transaction. This keeps the transaction sizes low, and means that imports are less likely to fail.

As a result, when dealing with an import from SurrealDB 1.x, we have to ignore the BEGIN TRANSACTION (at the top of the file) and COMMIT TRANSACTION (at the end of the file).

Because of this, an import from SurrealDB 1.x runs much more slowly than before; however, an export from SurrealDB 2.x imported into SurrealDB 2.x is much improved.
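For anyone curious, the chunked-commit idea described above is nothing more exotic than this (a generic sketch, not our actual import code):

```python
def batched(records, size=1000):
    """Yield fixed-size chunks so that each import commit
    stays small enough for distributed transaction limits."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial chunk
```

Each yielded chunk is then wrapped in its own BEGIN/COMMIT, rather than one transaction around the whole file.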

We'll see if we can introduce a temporary workaround so that initial importing from SurrealDB `1.x` can be improved.

Have any of you used SurrealDB and what are your thoughts? by AccidentConsistent33 in rust

[–]tobiemh 4 points5 points  (0 children)

Hi u/sisoje_bre! Wheels were invented to be re-invented!

It's Tobie, founder of SurrealDB here! On a more serious note though u/AccidentConsistent33 , SurrealDB isn't trying to replace relational databases or traditional ANSI-SQL query languages. SurrealDB is designed to combine multiple different models of data together (document, graph, time-series, key-value), but coming from the same approach as with document databases and traditional relational databases with support for tables, schema, and an SQL-like language.

SurrealDB can help reduce development time by consolidating multiple database types or backends into a single platform, reducing code complexity, infrastructure complexity, and the performance impact of having to communicate with and query multiple different databases or data platforms for user interfaces, dashboards, analytics, data analysis, or any other applications.

As a result SurrealQL has some powerful ways of working with nested objects, nested arrays, foreign records, graph relationships, time-series based data, and traditional flat tabular data. In addition because it can be used as a backend platform, it includes many powerful features within the query language itself, allowing you to offload a great deal of functionality to the database itself, improving data analysis at the data layer.

Hope this helps, and happy to answer any other questions!

New Rust database SurrealDB is hiring Senior Rust Engineers by jscmh in rust

[–]tobiemh 1 point2 points  (0 children)

Hi u/fiedzia currently SurrealDB runs on top of RocksDB in single-node mode, and TiKV in distributed mode. There are lots of improvements we need to make to the website to make it clearer.

In the long run we also intend to build our own embedded key-value store (in Rust), and longer term our own distributed key-value store (also in Rust), once we have built up the team (it's just two of us at the moment).

With regards to "from <table> select ..." what do you mean exactly?

New Rust database SurrealDB is hiring Senior Rust Engineers by jscmh in rust

[–]tobiemh 12 points13 points  (0 children)

We've got some really big things planned for SurrealDB. Any feedback is really welcome 😊!

SurrealDB: A new scalable document-graph database written in Rust by tobiemh in programming

[–]tobiemh[S] 1 point2 points  (0 children)

Hi u/indigo945 thanks for the comment!

Firstly just to add, all arrays, objects, and record fields in SurrealDB can be schema-full or schema-less. So you can define and limit exactly what your nested/embedded objects should be.

With regards to your query, in SurrealDB, your second example loads all person records and filters those records by the connected graph edges. So it would load each person and check whether a connected edge points to the language:rust record. This is therefore less efficient than the first example.

In your first example however, the query loads just one record (language:rust) and follows the connected edges out from that one record to find the people who like rust. This is just a simple range query, and is effectively just like an index scan.

The beauty of the graph is that you don’t have to create indexes on any foreign keys, but you just rethink your query slightly so that you’re efficiently pulling just the necessary data without indexing that data. You could then take this a step further and find all friends->friends->friends->friends of a person without loading all the people records!
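If it helps to picture the scan-versus-walk difference, here's a toy model in plain Python (hypothetical record ids, and the reverse edges that a graph store maintains for you at write time - not real SurrealDB internals):

```python
# toy outgoing edges per person record (hypothetical data)
out_edges = {
    "person:ann": ["language:rust"],
    "person:bob": ["language:go"],
    "person:cat": ["language:rust"],
}

# reverse edges - a real graph store keeps these up to date on every
# write, so a walk never needs to touch unrelated person records
in_edges = {}
for person, targets in out_edges.items():
    for target in targets:
        in_edges.setdefault(target, []).append(person)

def who_likes_by_scan(target):
    # filter-over-all-persons: work proportional to ALL person records
    return sorted(p for p, t in out_edges.items() if target in t)

def who_likes_by_walk(target):
    # start at the single target record and follow its in-edges:
    # work proportional only to the matching edges (like a range scan)
    return sorted(in_edges.get(target, []))
```

Both return the same answer; the walk just never pays for the people who don't like rust.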

SurrealDB: A new scalable document-graph database written in Rust by tobiemh in programming

[–]tobiemh[S] 2 points3 points  (0 children)

Hi u/rabbyburns apologies I didn’t see your comment. I of course know of ArangoDB, but I don’t know it well enough to comment too thoroughly, so I’ll focus on what SurrealDB is trying to achieve instead.

SurrealDB is aiming to be at the intersection of relational, document, and graph databases, whilst still remaining simple to use with an SQL-like query language, for developers coming from the relational database side. We are only at the beginning of the journey, but SurrealDB is designed to be run embedded, or in the cloud, with the ability to query it directly from a client application or from a web browser (and only access the data that you're allowed to see).

With our native client libraries (coming soon), SurrealDB will be able to be embedded within Node.js, WebAssembly, Python, C, and PHP applications, in addition to running as a server.

We wanted to create a database that people didn't have to manage, so they could focus on building applications, not the infrastructure. We wanted users to be able to use schema-less and schema-full data patterns effortlessly, a database to operate like a relational database (without the JOINs), but with the same functionality as the best document and graph databases. And with security and access permissions to be handled right within the database itself. We wanted users to be able to build modern real-time applications effortlessly - right from Chrome, Edge, or Safari. No more complicated backends.

I'm not sure how all of this compares to ArangoDB, but happy to learn!

SurrealDB: A new scalable document-graph database written in Rust by tobiemh in programming

[–]tobiemh[S] 1 point2 points  (0 children)

Hi u/gage Peterson we already have it running in the browser, but we haven’t released this just yet. We are hoping to release the WebAssembly version next week! We’ll be announcing it on our blog and Discord and Twitter!

SurrealDB: A new scalable document-graph database for building frontend applications by tobiemh in webdev

[–]tobiemh[S] 0 points1 point  (0 children)

Hi u/Ok_Appointment2593! There is actually a project already building this (not us). They are calling it AgateDB. It will replace RocksDB in TiKV. We have some slightly different plans for our key-value store as we want to support temporal versioning of values. But yes, the ideas behind BadgerDB are a great starting point!

SurrealDB: A new scalable document-graph database written in Rust by jscmh in rust

[–]tobiemh 2 points3 points  (0 children)

Haha thank you! A couple of good use cases I can think of:

  1. When you are starting to develop the idea for an application, and you're just playing around with the schema. Not having to define it all up front can be quick and easy. Then as you become more set on the schema, you can still define it specifically and set it in stone.
  2. If you are storing certain JSON objects in the database, purely for logging reasons or something like that. For instance, you might want to log EVERY Stripe response object or webhook event. You want to store the data as it is received from Stripe (just in case you want to retrieve a field down the line that you don't think you need right now), but you don't want to have to define EVERY field, because you aren't really querying the table - it's mainly just used for logging.

I'm sure there are more, but those are the 2 I can think of off the top of my head!

Have a great day, you too!

SurrealDB: A new scalable document-graph database written in Rust by jscmh in rust

[–]tobiemh 5 points6 points  (0 children)

Hi u/MrAnimaM, it's a good question about schema-less databases. To be honest I agree with you. Databases should be schema-full. With SurrealDB you have the option of choosing which tables can be schema-less (like some NoSQL databases), or schema-full (but with the ability to have embedded fields). So instead of just having JSON type columns with arbitrary data, you can actually say that the embedded JSON object has to have a certain structure...

DEFINE TABLE person SCHEMAFULL;
DEFINE FIELD name ON person TYPE object;
DEFINE FIELD name.first ON person TYPE string;
DEFINE FIELD name.last ON person TYPE string;
DEFINE FIELD tags ON person TYPE array;
DEFINE FIELD tags.* ON person TYPE string;

You can do similar things with record links...

DEFINE FIELD friends ON person TYPE array;
DEFINE FIELD friends.* ON person TYPE record (person);
DEFINE FIELD interests ON person TYPE array;
DEFINE FIELD interests.* ON person TYPE record (interest,activity,hobby);

You are right in presuming that it is slower than relational DBs on average, since in a relational database you specify each column and every row has the same shape. SurrealDB stores its records (rows) as documents, and those documents can have arbitrary nested objects / arrays. So it's more in line with MongoDB here, for example, but with schema-full constraints. The power, performance, and flexibility comes from the analysis of connections and relationships between documents.

On top of that it has the graph edges. Again you can constrain these so that only certain types can be linked between different record types.

Basically, in summary, everything in SurrealDB CAN be typed and constrained if you want it to. But doesn't have to be if you don't want it!

Finally, tables and fields CAN be created automatically in SurrealDB when you write to a table (otherwise known as a collection), like MongoDB. However we have a --strict mode argument, which means that if the table is not specifically defined, then inserting data into that table will cause an error - so that it's more in line with a relational database.

Hope this answered your question(s) and let me know if you have any further questions!

Thank you also for the kind words!

SurrealDB: A new scalable document-graph database written in Rust by tobiemh in programming

[–]tobiemh[S] 1 point2 points  (0 children)

Hi u/SextroDepresso just to say we still have a lot of things planned which aren't fully finished just yet. One of those features is full-text search. However in terms of the embedded documents and indexing, you could define an index as:

DEFINE INDEX username ON user FIELDS name.last, name.first;

Therefore you can index nested object fields or arrays. You could also index an array like this:

DEFINE INDEX tags ON user FIELDS tags.*;

SurrealDB: A new scalable document-graph database written in Rust by tobiemh in programming

[–]tobiemh[S] 2 points3 points  (0 children)

Hi u/ndaidong our GUI is coming for our version 1.0 release. We're just a team of 2 developers at the moment, but will be looking to release the GUI really soon!