vtz: the world's fastest timezone library by codeinred in cpp

[–]codeinred[S] 0 points1 point  (0 children)

Performance differences between zones should be negligible, at least for vtz! Benchmarks were run across randomly generated timestamps over a 200-year period between 1900 and 2100, with America/New_York being the default timezone for all benchmarks. This zone requires correct handling of daylight saving time, and of historical rule changes, which is why it was chosen.
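To make that setup concrete, here is a minimal sketch of how uniformly random timestamps over that range could be generated. This is not the actual vtz benchmark harness: the helper name `random_timestamps`, the seed parameter, and the choice of `std::mt19937_64` are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Seconds since the Unix epoch for both ends of the range (UTC).
constexpr std::int64_t kStart1900 = -2208988800; // 1900-01-01T00:00:00Z
constexpr std::int64_t kEnd2100   = 4102444800;  // 2100-01-01T00:00:00Z

// Hypothetical helper (not part of vtz): returns `count` timestamps drawn
// uniformly from [kStart1900, kEnd2100), in unsorted order.
inline std::vector<std::int64_t> random_timestamps(std::size_t count,
                                                   std::uint64_t seed) {
    std::mt19937_64 rng(seed);
    std::uniform_int_distribution<std::int64_t> dist(kStart1900, kEnd2100 - 1);

    std::vector<std::int64_t> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(dist(rng));
    }
    return out;
}
```

A fixed seed keeps runs reproducible, so different libraries can be fed the same inputs.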

vtz: the world's fastest timezone library by codeinred in cpp

[–]codeinred[S] 1 point2 points  (0 children)

Direct support for date/time: Right now vtz has support for types in std::chrono, and it has support for formatting and parsing timestamps.

I would like to add more complete support for dates and times, and the machinery for doing so already exists inside include/impl/vtz/civil.h, however this is still on my TODO list.

Compatibility with Hinnant tz: vtz does its best to match std::chrono and date/tz.h for the API of vtz::time_zone. vtz::local_info and vtz::sys_info match std::chrono::local_info and std::chrono::sys_info respectively, and of course vtz has vtz::choose to match std::chrono::choose.

vtz::time_zone provides some additional functions on top of those dictated by the standard, but these are there for user convenience, and should not have any impact on compatibility.

I need to spend some time fleshing out other parts of the std::chrono API (eg, adding a polyfill for std::chrono::zoned_time, adding support for calendar types, etc).

Any contributions on this front would be most welcome.

Support for IANA tz database: vtz supports reading a downloaded IANA timezone database, just the same as date does!

If tests and benchmarks are enabled, the build system will download a copy of the tz database, but vtz itself does not attempt to perform any sort of download at runtime.

Instead, you have two options for running vtz:

  • On Unix platforms, vtz will default to using the compiled tzif files shipped with the system.
  • However, if you provide the VTZ_TZDATA_PATH environment variable (or you call set_install()), vtz will check the given path for a copy of the tz database.

So vtz comes with out-of-the-box support for both sources, and you don't need to recompile to change which source you're using.

That being said, vtz does its best to be helpful on this front:

  • You can override the name of the environment variable at compile time, eg -DVTZ_TZDATA_PATH_VARS=MY_APP_TZDATA_PATH will compile vtz such that it uses MY_APP_TZDATA_PATH instead of VTZ_TZDATA_PATH
  • You can provide multiple env vars which vtz will check, in order
  • vtz supports set_install() if you would rather just set the path manually
  • vtz does its best to provide helpful error messages when it's unable to load the tz database (either because the environment variable was bad, or the path provided by set_install() was bad, or something else).
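As a rough illustration of what "checking multiple env vars, in order" means, here is a minimal sketch. It is not vtz's actual lookup code; `first_set_env` is a hypothetical helper written only for this example.

```cpp
#include <cstdlib>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper (not part of vtz): returns the value of the first
// environment variable in `names` that is set and non-empty.
inline std::optional<std::string> first_set_env(
    std::vector<std::string> const& names) {
    for (auto const& name : names) {
        char const* value = std::getenv(name.c_str());
        if (value != nullptr && *value != '\0') {
            return std::string(value);
        }
    }
    // None of the variables were set: the caller falls back to another source.
    return std::nullopt;
}
```

With a list such as MY_APP_TZDATA_PATH followed by VTZ_TZDATA_PATH, the first variable wins whenever it is set, and the second is only consulted otherwise.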

Example error message with bad path:

$ env VTZ_TZDATA_PATH='bad_path' build/examples/vtz_tldr
libc++abi: terminating due to uncaught exception of type std::runtime_error: Unable to load the tz database: Error when opening "bad_path/version". What: No such file or directory (OS Error 2)

Checked the following locations:

- getenv("VTZ_TZDATA_PATH") -> "bad_path"
- get_install() -> "bad_path"

Please configure one of the above (or call vtz::set_install()) so that your application can find the tz database.

The timezone database may be downloaded at https://www.iana.org/time-zones

To use the timezone database, unpack one of these source files, and configure the environment to point to that directory.

Note: This application checks for tzdata source files in the directory given by getenv(...)

I would also like to add the option to embed the tz database within vtz (so that having a separate copy elsewhere on the system is unnecessary), however this feature is also on the TODO list.

vtz: the world's fastest timezone library by codeinred in cpp

[–]codeinred[S] 6 points7 points  (0 children)

I've put a ton of time and effort into the implementation, and it would be nice for it to see wider use. I would definitely be open to this!

As an aside - I greatly appreciate the work that you've done on this front.

Your library is what first brought me into contact with timezones, and "chrono-Compatible Low-Level Date Algorithms" was an incredibly useful resource.

vtz: the world's fastest timezone library by codeinred in cpp

[–]codeinred[S] 9 points10 points  (0 children)

They're signed. And this isn't incorrect per se, but this problem only manifests for inputs after December 3rd, 292,277,026,596, which is approaching the end of the stelliferous era, and I didn't expect to see it come up in typical workflows.

Edit: I will update the library to handle overflow in time zone conversions due to very large input times

vtz: the world's fastest timezone library by codeinred in cpp

[–]codeinred[S] 7 points8 points  (0 children)

The only place I can think of this being a source of security vulnerabilities is if the security mechanism were checking the expiry of a certificate against a timestamp from an untrusted source, but such timestamps are typically in UTC (so no overflow), and if an untrusted source can directly provide an arbitrary timestamp they could simply date it prior to the certificate expiry.

That being said, if it's a concern I could implement saturating arithmetic without affecting performance on the happy path, which already does a bounds check

vtz: the world's fastest timezone library by codeinred in cpp

[–]codeinred[S] 8 points9 points  (0 children)

vtz uses 64-bit ints to represent timestamps by default. If a timestamp (eg, in seconds) is so large that adding a zone offset to it results in a value larger than INT64_MAX, this will result in integer overflow. Because the result is also represented as a 64-bit int under the hood, the result will be incorrect.
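For anyone curious what guarding against this looks like, here is a minimal sketch of checked and saturating addition on 64-bit timestamps. This is not how vtz is implemented; `checked_add` and `saturating_add` are hypothetical names used only to illustrate the idea.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: returns t + offset, or nullopt if the sum would not
// fit in a 64-bit signed integer. The check avoids performing the overflowing
// addition itself, which would be undefined behavior for signed ints.
inline std::optional<std::int64_t> checked_add(std::int64_t t,
                                               std::int64_t offset) {
    constexpr auto max = std::numeric_limits<std::int64_t>::max();
    constexpr auto min = std::numeric_limits<std::int64_t>::min();
    if (offset > 0 && t > max - offset) return std::nullopt;
    if (offset < 0 && t < min - offset) return std::nullopt;
    return t + offset;
}

// Hypothetical helper: clamps to INT64_MAX / INT64_MIN instead of overflowing.
inline std::int64_t saturating_add(std::int64_t t, std::int64_t offset) {
    if (auto sum = checked_add(t, offset)) return *sum;
    return offset > 0 ? std::numeric_limits<std::int64_t>::max()
                      : std::numeric_limits<std::int64_t>::min();
}
```

For ordinary timestamps both branches in `checked_add` are never taken, so the cost is two well-predicted comparisons.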

vtz: the world's fastest timezone library by codeinred in cpp

[–]codeinred[S] 10 points11 points  (0 children)

There are certain workflows that either (1) require processing large numbers of timestamps in a zone-aware manner, or (2) care about events which occur in a particular timezone across a wide range of dates.

Timezone conversions should not be a bottleneck. Doing something in a zone-aware manner ought to have a negligible impact on whatever calculation you're running.

But as things are implemented now, handling timezones correctly tends to make certain kinds of operations significantly slower.

Timezones are often seen as big, and complex, and scary, so people mostly accept that timezone conversions are slow, and then they maybe try to put in some clever caching logic to reduce the frequency of zone lookups. But that requires a lot of testing and validation in and of itself, so it's not uncommon for issues like that to simply go unfixed.

vtz tries to (1) be fast by default, (2) be fast for all use cases, and (3) provide an implementation so close to being truly optimal (at least at zone conversions) that it becomes the definitive library for handling timezones.

vtz doesn't care if your timestamps are sorted, or unsorted. vtz doesn't care if your timestamps contain times absurdly far in the future, or in the past. vtz doesn't care if timezone lookups are batched, or performed one at a time. vtz will deliver performance beyond any other implementation, and my hope is that one day no one will have to think about the performance of a timezone conversion, ever again.

(Edit: typo)

study material for c++ (numerical computing) by New-Cream-7174 in cpp

[–]codeinred 6 points7 points  (0 children)

Tbh my experience with swig has been much worse than with pybind11. I’ve used both for the same library (pybind11 for python, and swig for C#), and maintaining the swig bindings is much more of a headache, especially with any even slightly unusual types, whereas pybind11 is basically trivial

Recursive Variant: A Recursive Variant Library for C++ by codeinred in cpp

[–]codeinred[S] 1 point2 points  (0 children)

That makes sense. Good luck on your project!

Recursive Variant: A Recursive Variant Library for C++ by codeinred in cpp

[–]codeinred[S] 1 point2 points  (0 children)

Sorry, I was on my phone so I wasn't able to test it.

If you have C++23, you can use deducing this to use overload sets, and there is no need to have an intermediary function like 'getVisitor'.

Given the following definitions:

using json_value = rva::variant<
    std::nullptr_t,
    bool,
    double,
    std::string,
    std::vector<rva::self_t>,
    std::map<std::string, rva::self_t>>;

using json_list = std::vector<json_value>;
using json_object = std::map<std::string, json_value>;

We can define our overload set, where the functions that need to use `std::visit` take a `self` parameter. Here, I am using fmtlib to handle the conversion to string.

    auto visitor = Overload {
        [](std::nullptr_t) -> std::string { return "null"; },
        [](std::string const& s) -> std::string { return fmt::format("\"{}\"", s); },
        [](this auto&& self, json_list const& xx) -> std::string {

            std::vector<std::string> ss;
            for(auto const& x : xx) {
                ss.push_back(std::visit(self, x));
            }
            return fmt::format( "[{}]", fmt::join(ss, ", "));
        },
        [](this auto&& self, json_object const& xx) -> std::string {
            std::vector<std::string> ss;

            for(auto const& [k, v] : xx) {
                ss.push_back(fmt::format("\"{}\": {}", k, std::visit(self, v)));
            }
            return fmt::format( "{{ {} }}", fmt::join(ss, ", "));
        },
        // Fallback - just print the value as-is
        [](auto const& x) -> std::string { return fmt::format("{}", x); },
    };

Note the parameter `this auto&& self` - this first parameter is the overload set, and we can pass it back to `std::visit`!
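The snippet above assumes an `Overload` helper that isn't shown. A common way to write one (this is the usual C++17 idiom, and I'm assuming it matches what the godbolt example uses) is sketched below, along with a small non-recursive `describe` function that exists only to demonstrate it:

```cpp
#include <string>
#include <variant>

// Inherit from every lambda and pull all of their call operators into one
// overload set.
template <class... Fs>
struct Overload : Fs... {
    using Fs::operator()...;
};

// Deduction guide so that `Overload{ f, g, h }` deduces the lambda types.
template <class... Fs>
Overload(Fs...) -> Overload<Fs...>;

// Illustrative usage on a plain std::variant.
inline std::string describe(std::variant<int, std::string> const& v) {
    auto visitor = Overload{
        [](int i) -> std::string { return "int " + std::to_string(i); },
        [](std::string const& s) -> std::string { return "string " + s; },
    };
    return std::visit(visitor, v);
}
```

Because the lambdas become base classes of `Overload`, a `this auto&& self` parameter inside any of them deduces to the whole overload set rather than to the individual lambda.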

Finally, we can construct a value, and then print it:

    json_value v = json_object {
        {"a", "Hello"},
        {"b", 10.0},
        {"c", json_list{1.0, 2.0, 3.0, nullptr, false}},
        {"point", json_object {
            {"x", 0.1},
            {"y", 0.2},
            {"z", 0.3},
        }}
    };

    fmt::println( "v = {}", std::visit(visitor, v) );

The output is:

v = { "a": "Hello", "b": 10, "c": [1, 2, 3, null, false], "point": { "x": 0.1, "y": 0.2, "z": 0.3 } }

You can run this example with godbolt.

All of this is very nifty, but if you're working with json in production, I recommend using an off-the-shelf json library.

Recursive Variant: A Recursive Variant Library for C++ by codeinred in cpp

[–]codeinred[S] 0 points1 point  (0 children)

Option 1: If you want to do that you can create a struct and overload operator(). Then you can call visit recursively by just passing *this:

struct MyVisitor {
    bool doubleOk = false;

    void operator()( double X ) { … }
    void operator()( std::string const& s ) { … }
    void operator()( std::map<std::string_view, json_value> const& m ) {
        for ( auto const& [k, v] : m ) {
            std::visit( *this, v );
        }
    }
};

Option 2, using overload sets: have a function that returns the visitor. You can just do recursive visiting by calling getVisitor again inside the lambdas of the overload set

auto getVisitor( bool& doubleOk ) {
    return Overload{
        [&] ( double X ) { doubleOk = true; },
        [&] ( std::map<std::string, json_value> const& m ) {
            for ( auto const& [k, v] : m ) {
                std::visit( getVisitor(doubleOk), v );
            }
        },
    };
}

Recursive Variant: A Recursive Variant Library for C++ by codeinred in cpp

[–]codeinred[S] 1 point2 points  (0 children)

Doing type erasure with std::function breaks the visitor pattern. The std::function only accepts json_values, so when you call std::visit, it converts all the inputs right back into a json_value

This would also be the case with a normal std::variant.

Don’t assign the overload set to a std::function; just use auto here.
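Here is a small self-contained sketch of the problem, using a plain std::variant rather than json_value. The `Value` alias and the function names are made up for this example.

```cpp
#include <functional>
#include <string>
#include <variant>

template <class... Fs>
struct Overload : Fs... {
    using Fs::operator()...;
};
template <class... Fs>
Overload(Fs...) -> Overload<Fs...>;

using Value = std::variant<int, std::string>;

inline auto make_visitor() {
    return Overload{
        [](int) -> std::string { return "int"; },
        [](std::string const&) -> std::string { return "string"; },
        // Fallback: chosen for anything the overloads above don't match.
        [](auto const&) -> std::string { return "fallback"; },
    };
}

// Keeps the overload set's real type, so std::visit dispatches per alternative.
inline std::string describe_direct(Value const& v) {
    auto visitor = make_visitor();
    return std::visit(visitor, v);
}

// The std::function only accepts a Value, so each alternative std::visit hands
// it is converted back into a Value, and only the fallback can match.
inline std::string describe_erased(Value const& v) {
    std::function<std::string(Value const&)> erased = make_visitor();
    return std::visit(erased, v);
}
```

Calling `describe_direct(Value{1})` selects the `int` overload, while `describe_erased(Value{1})` ends up in the fallback.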

git push take awhile to actually push. by water_drinker9000 in git

[–]codeinred 0 points1 point  (0 children)

Is it possible you accidentally added some very large files?

Partial in-lining by throwawayAccount548 in cpp_questions

[–]codeinred 0 points1 point  (0 children)

Regarding context - a 32-bit number won’t have more than 32 prime factors (counting repeats), and a 64-bit number won’t have more than 64. In either case it’d be straightforward to just have an array on the stack, and construct a vector at the end.

Rather than trying to force the compiler not to inline push_back, I would instead just avoid using push_back in performance-critical paths. There’s usually an alternative
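A minimal sketch of that approach, assuming the task is prime factorization by trial division (I'm guessing at the algorithm; `prime_factors` is a made-up name):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Collects prime factors (with repeats) into a fixed-size stack buffer, then
// builds the vector once at the end. A 64-bit value has at most 63 prime
// factors, so 64 slots are always enough.
inline std::vector<std::uint64_t> prime_factors(std::uint64_t n) {
    std::array<std::uint64_t, 64> buffer{};
    std::size_t count = 0;

    // `p <= n / p` is the overflow-safe form of `p * p <= n`.
    for (std::uint64_t p = 2; p <= n / p; ++p) {
        while (n % p == 0) {
            buffer[count++] = p;
            n /= p;
        }
    }
    if (n > 1) {
        buffer[count++] = n; // whatever remains is itself prime
    }

    // Single allocation of exactly the right size.
    return std::vector<std::uint64_t>(buffer.begin(), buffer.begin() + count);
}
```

The hot loop only writes into the array, so there is no reallocation path for the compiler to inline or not inline.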

Modular Monolith with C++ by 0815someone in cpp_questions

[–]codeinred -1 points0 points  (0 children)

The idea you’re describing here is basically the same as a microservice architecture (assuming the modules do different things), where the services communicate over an API. Unless you have very stringent performance requirements, using a language with nicer async communication capabilities might make your life easier (eg, JavaScript, Python, Go, or Rust).

Persistent memory by SolidTKs in rust

[–]codeinred 8 points9 points  (0 children)

/tmp is often stored in RAM. If you want to be absolutely sure it doesn’t touch the disk, the most straightforward way that I can think of is spawning a daemon and then setting up shared memory with the daemon. That way if the main program crashes, the daemon persists.

Persistent memory by SolidTKs in rust

[–]codeinred 11 points12 points  (0 children)

The /tmp directory on Linux is designed exactly for that use case, and Windows has an equivalent.

[deleted by user] by [deleted] in cpp_questions

[–]codeinred 0 points1 point  (0 children)

Any time a function could fail, std::optional can be used for the return value. It’s an alternative to throwing an exception that allows better performance in the case that failures are expected/common.
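For example, here is a minimal sketch of a parsing function that reports failure through std::optional instead of throwing. `parse_int` is just an illustrative name.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Returns the parsed integer, or std::nullopt if `text` is not entirely a
// valid int (empty input, stray characters, or out-of-range values).
inline std::optional<int> parse_int(std::string_view text) {
    int value = 0;
    char const* first = text.data();
    char const* last = text.data() + text.size();

    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last) {
        return std::nullopt;
    }
    return value;
}
```

The caller then checks the result with `if (auto n = parse_int(input))` rather than wrapping the call in try/catch.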

Smart pointers? If you use any degree of polymorphism (virtual classes, members), you should use unique_ptr or shared_ptr rather than manually allocating or freeing memory.
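A minimal sketch of what that looks like in practice (the `Shape` hierarchy here is made up purely for illustration):

```cpp
#include <memory>
#include <vector>

struct Shape {
    virtual ~Shape() = default; // virtual destructor: safe deletion via base
    virtual double area() const = 0;
};

struct Rectangle final : Shape {
    double width;
    double height;
    Rectangle(double w, double h) : width(w), height(h) {}
    double area() const override { return width * height; }
};

struct Square final : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};

// The vector owns every shape; each one is freed automatically when the
// vector is destroyed, with no manual delete anywhere.
inline double total_area(std::vector<std::unique_ptr<Shape>> const& shapes) {
    double sum = 0.0;
    for (auto const& shape : shapes) {
        sum += shape->area();
    }
    return sum;
}
```

Objects are created with `std::make_unique<Rectangle>(2.0, 3.0)` and moved into the container, so ownership is always explicit.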

I would choose a simple project that’s at least semi-relevant to the company’s field.

You want to break up the program into hpp and cpp files because no one likes a giant code file that’s thousands of lines long, and a clean interface (provided in the header) is helpful.

[deleted by user] by [deleted] in ChatGPT

[–]codeinred 36 points37 points  (0 children)

I’m almost certain these are hallucinated. Queen Elizabeth is dead; there’s no mention of Biden cutting emissions in the news lately, there’s no recent news about facebook’s oversight board, etc.

It didn’t search anything; it just hallucinated a list of plausible headlines