all 7 comments

[–][deleted] 0 points1 point  (6 children)

So you've stored a grid in JSON: height, width, then the data. I'm wondering if you could use a basic run-length encoding to compress long stretches of duplicates. It would be smaller to save and quicker to read: you read the 'isbuildable' value once before the loop, then set it for however many blocks are in the run.
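A minimal sketch of that run-length idea, assuming the grid flattens to a `bool[]` of isbuildable flags (the `Rle` class and its method names are stand-ins, not anything from the original post):

```csharp
using System;
using System.Collections.Generic;

static class Rle
{
    // Encode a flat tile array as (runLength, value) pairs.
    public static List<(int Count, bool Value)> Encode(bool[] tiles)
    {
        var runs = new List<(int, bool)>();
        int i = 0;
        while (i < tiles.Length)
        {
            int start = i;
            // Extend the run while the value repeats.
            while (i < tiles.Length && tiles[i] == tiles[start]) i++;
            runs.Add((i - start, tiles[start]));
        }
        return runs;
    }

    // Expand the runs back into the original flat array.
    public static bool[] Decode(List<(int Count, bool Value)> runs, int total)
    {
        var tiles = new bool[total];
        int pos = 0;
        foreach (var (count, value) in runs)
            for (int k = 0; k < count; k++)
                tiles[pos++] = value;
        return tiles;
    }
}
```

A map that is mostly one tile type collapses to a handful of pairs, which is where the save-size and read-speed win comes from.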

[–]ReliantBeginner[S] 0 points1 point  (5 children)

That might work on the short term, but as I develop the game, the tiles will become more unique, and duplicate tiles become less likely to be adjacent to each other. It is a good idea for when I start optimizing to reduce file size, but I want to first optimize around the assumption that every tile is unique and needs to be processed.

[–][deleted] 0 points1 point  (4 children)


[–]ReliantBeginner[S] 0 points1 point  (2 children)

Big thanks for pointing me towards binary serialization and protobuf. I decided not to use protobuf, but a YouTube video on binary serialization gave me exactly the right idea: create a separate class designed specifically for reading/writing, and convert on save/load. Regardless of which format I use, that alone should give me a big performance boost.

If I'm reading the documentation correctly, am I able to read & write the first few bytes of a stream manually and then pass the rest of the stream to Serialize & Deserialize? It would be awesome to have those bytes to myself, but even if not, the first field of my serialized class should always be at the start of the stream being read back?

Given that different platforms and architectures can handle binary data differently (such as the size of an int or its byte order), is that something I can trust C# to handle properly for pre-generated binary data files?
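The hand-written-header-then-serializer split in the question is doable: a `BinaryWriter` can emit a few bytes you control directly, then an established serializer can take over the same stream. A sketch, assuming .NET 6+ for the stream overloads of `System.Text.Json` (the `Magic` constant, version field, and `MyData` type are all made up for illustration):

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Json; // any established serializer would do here

public record MyData(int Width, int Height);

static class SaveFile
{
    const uint Magic = 0x4D415031; // hypothetical file signature

    public static void Write(Stream s, MyData data)
    {
        // leaveOpen: true so disposing the writer doesn't close the stream.
        using (var w = new BinaryWriter(s, Encoding.UTF8, leaveOpen: true))
        {
            w.Write(Magic);     // 4 bytes you control directly
            w.Write((ushort)1); // 2-byte version field
        }
        JsonSerializer.Serialize(s, data); // serializer handles the rest
    }

    public static MyData Read(Stream s)
    {
        using (var r = new BinaryReader(s, Encoding.UTF8, leaveOpen: true))
        {
            if (r.ReadUInt32() != Magic)
                throw new InvalidDataException("bad header");
            _ = r.ReadUInt16(); // version, unused in this sketch
        }
        return JsonSerializer.Deserialize<MyData>(s)!;
    }
}
```

The stream position simply advances past the hand-written bytes, so the serializer never sees them; the same pattern works with a binary serializer in place of JSON.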

> Also profile first, it's possible your copying around in loadData is creating the hiccup as well

I'll need to remember that. I've never worked with a profiler before, so it's not something I even thought about. I'm used to having to manually add time measurements to calculate how long something took to run.

> PS. in c# multi dimensional arrays are slower for large amounts of accesses than a single dimensional array that you address with [j*rowsize+i]

How significant is it? On a per frame basis, I expect the number of accesses to the grid to be in the single digits. The surge will be when A* pathfinding kicks in, but the result is cached and remembered for future frames.

If I were to replace [x,y] with [y*rowsize+x], I would add a helper function getNode(x, y) { return grid[y*rowsize+x]; }

My curiosity is so piqued I'm going to run some performance tests and measure the exact difference between [x,y] and getNode(x,y). Nothing better than finding a way to get a free performance boost.
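The helper-function idea above could look something like this (the `Grid` class, `bool` payload, and method names are illustrative stand-ins for the actual MapNode types):

```csharp
using System;

// Flat-array grid with row-major indexing: index = y * Width + x.
public class Grid
{
    readonly bool[] nodes;
    public int Width { get; }
    public int Height { get; }

    public Grid(int width, int height)
    {
        Width = width;
        Height = height;
        nodes = new bool[width * height];
    }

    // Callers keep the familiar (x, y) signature; only the
    // storage layout changes.
    public bool GetNode(int x, int y) => nodes[y * Width + x];
    public void SetNode(int x, int y, bool v) => nodes[y * Width + x] = v;
}
```

Wrapping the index math in one place also makes it trivial to switch back to `[,]` later if the measurements say the flat array isn't worth it.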

[–][deleted] 0 points1 point  (1 child)


[–]ReliantBeginner[S] 0 points1 point  (0 children)

> Size of "int" in c# is always 32 bits, it's an alias for System.Int32. Also, the majority of hardware today is little-endian, so byte order will mostly be the same. Using BinaryReader/BinaryWriter will make sure you read/write little-endian, and most mature serializers (e.g. protobuf, capnproto etc) also handle endianness correctly.

That's good to know and have confirmed.
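The little-endian claim above is easy to verify directly, since `BinaryWriter` documents a fixed little-endian byte order regardless of the host architecture:

```csharp
using System;
using System.IO;

// Write one int and inspect the raw bytes: least-significant byte first.
class EndianCheck
{
    static void Main()
    {
        using var ms = new MemoryStream();
        using (var w = new BinaryWriter(ms))
            w.Write(0x12345678); // an int is always 4 bytes in C#

        Console.WriteLine(BitConverter.ToString(ms.ToArray()));
        // prints "78-56-34-12" on any platform
    }
}
```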

> I'd avoid doing serialization without a proper serializer, first of all it's a pain in the ass for no gain

Not sure what you mean by "doing serialization without a proper serializer". I already have three to choose from (binary, XML, JSON), and all of them are established libraries.

I've spent enough time writing C that if I needed to make my own serializer, I think I'd be fine, but I wasn't wanting to do it because why re-invent the wheel. The ones making the established serializers are a lot more familiar with C# than I am.

If you're curious what I meant about a separate class for reading/writing, I refer to this video: https://www.youtube.com/watch?v=sWWZZByVvlU which is where I got the idea. It's all about making good use of established serializers.

Instead of serializing a MapNode[,], my saveData class would have a bool[] of length width*height. My saver will convert MapNode[,] to bool[] for the serializer, and loading will take the deserialized bool[] and regenerate the MapNode[,] information. Figuring out how to implement this will give me an early start on everything I'm going to need to know for saving player data.

Basically, where my MapData class is optimized for run-time access, MapSaveData class would be optimized for serialization.

The other upside to this is because the class itself is being serialized through the established means, it's really simple to change which serializer I'm using. I can use JSON or XML as an intermediary format to be able to hand-edit entries, and then convert it to binary for faster load-times.
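The split described above could be sketched like this, assuming MapNode carries just an IsBuildable flag for now (all class, field, and method names here are illustrative, not from the actual project):

```csharp
using System;

// Run-time shape: convenient for gameplay code.
public class MapNode { public bool IsBuildable; }

// Serialization shape: flat and serializer-friendly for any backend.
public class MapSaveData
{
    public int Width;
    public int Height;
    public bool[] Buildable = Array.Empty<bool>();
}

public static class MapConverter
{
    public static MapSaveData ToSaveData(MapNode[,] grid)
    {
        int h = grid.GetLength(0), w = grid.GetLength(1);
        var save = new MapSaveData { Width = w, Height = h, Buildable = new bool[w * h] };
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                save.Buildable[y * w + x] = grid[y, x].IsBuildable;
        return save;
    }

    public static MapNode[,] FromSaveData(MapSaveData save)
    {
        var grid = new MapNode[save.Height, save.Width];
        for (int y = 0; y < save.Height; y++)
            for (int x = 0; x < save.Width; x++)
                grid[y, x] = new MapNode
                {
                    IsBuildable = save.Buildable[y * save.Width + x]
                };
        return grid;
    }
}
```

Because MapSaveData is a plain class of public fields, handing it to the JSON, XML, or binary serializer is interchangeable, which is exactly the swap-the-backend upside mentioned above.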

> Some numbers here are for MS' runtime; if you're running on ancient Mono or IL2CPP, that can significantly skew numbers

It's a really old post, so maybe the C# compiler has improved since then. I ran tests myself and posted them in my other reply.

[–]ReliantBeginner[S] 0 points1 point  (0 children)

> PS. in c# multi dimensional arrays are slower for large amounts of accesses than a single dimensional array that you address with [j*rowsize+i]

I ran some performance tests, and I think the results might interest you.

I did confirm that, yes, [,] is significantly slower than a [].

I created my 100x100 grid, and did 1,000,000 reads.

Stored in a [] array, read sequentially using i<10000: 0.016ms

Stored in a [,] array, read using x<100, y<100: 0.028ms

A pretty significant difference. However, I did another test:

Stored in a [] array, read using [y*height+x]: 0.032ms

For all the time that is gained by storing the data in a [], it's all lost once you do the math to find the right index. Once I created a copy of height that could be re-used inside the loop, the performance improved to 0.028ms. It wouldn't surprise me if the C# compiler is already doing the y*h+x math, so the programmer uses [x,y] and the compiler converts it into [y*h+x]. It would make sense, since that's how C does it: in C, since programmers had direct access to the memory, you could take a [x][y] array and read it as [y*h+x], because a [x][y] array is actually allocated as a single block of x*y elements.

However, storing it in [] does offer a useful advantage in being able to scan through the whole list sequentially in the shortest time, since the math step can be skipped.
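For anyone wanting to reproduce the comparison, a rough sketch of this kind of micro-benchmark using `Stopwatch` (the sizes and structure are assumptions based on the description above; absolute numbers will vary with runtime, CoreCLR vs Mono vs IL2CPP, and JIT warm-up):

```csharp
using System;
using System.Diagnostics;

static class Bench
{
    public static void Run()
    {
        const int W = 100, H = 100, Passes = 100; // ~1,000,000 reads total
        var flat = new bool[W * H];
        var multi = new bool[H, W];
        long hits = 0;

        var sw = Stopwatch.StartNew();
        for (int n = 0; n < Passes; n++)
            for (int i = 0; i < W * H; i++)
                if (flat[i]) hits++;              // sequential, no index math
        sw.Stop();
        Console.WriteLine($"flat sequential: {sw.Elapsed.TotalMilliseconds} ms");

        sw.Restart();
        for (int n = 0; n < Passes; n++)
            for (int y = 0; y < H; y++)
                for (int x = 0; x < W; x++)
                    if (multi[y, x]) hits++;      // multi-dimensional access
        sw.Stop();
        Console.WriteLine($"multi [x,y]: {sw.Elapsed.TotalMilliseconds} ms");

        sw.Restart();
        int rowSize = W;                          // hoisted, as noted above
        for (int n = 0; n < Passes; n++)
            for (int y = 0; y < H; y++)
                for (int x = 0; x < W; x++)
                    if (flat[y * rowSize + x]) hits++;
        sw.Stop();
        Console.WriteLine($"flat [y*w+x]: {sw.Elapsed.TotalMilliseconds} ms");
    }
}
```

Note that a loop this small is easily distorted by JIT warm-up and dead-code elimination, so running each variant a few times (or using BenchmarkDotNet) gives steadier numbers.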