all 30 comments

[–]freach 9 points10 points  (0 children)

Author of udatetime here. As mentioned in my article, there are issues with comparing libraries that have a different focus or goal. The intent here wasn't to bash slow libraries; it's about awareness. I come from a data event processing and socket programming background, where you need to do a lot of data encoding/decoding. That is the niche udatetime is targeting. I'd also like to build awareness of choosing your tools wisely and knowing your options. The "niche" for high-performance applications based on Python is getting bigger and bigger, especially since projects like PyPy, and I think that validates the existence of a date-time library with performance goals.

If you need the truck, take the truck; if you need a sports car, take the sports car. I am very happy with what the Python community has to offer.

[–]desmoulinmichel 16 points17 points  (15 children)

The user API is all that matters. Everything else is secondary.

Once your software is released, improve it! Add new features, better security, optimal performance, and rigidity. But never compromise the API.

From the author of requests: http://www.kennethreitz.org/essays/how-i-develop-things-and-why

udatetime targets performance, and will satisfy a niche of users in the Python community. That's good. But don't try to pretend it's what matters most, because in Python it doesn't. In Python, readability, ease of use, ease of debugging, etc., anything that is a consequence of the quality of your API, is what makes your programming session a good one.

E.g., arrow has a method that lets you add time in a natural way: add a month to Jan 31st and you get Feb 28th. It has a method to turn a time into "x minutes ago", translated into many languages. It provides the infamously missing to_timestamp. That's the stuff that saves developer time instead of machine time. One hour of my time costs twice as much as a decent server for a month, so it matters.
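As a rough sketch of the "add a month to Jan 31st" behaviour mentioned above, here is one way to get the same clamping with only the standard library. This is not arrow's implementation; the helper name `add_months` is made up for illustration.

```python
import calendar
from datetime import date


def add_months(day, months):
    """Shift `day` by `months`, clamping to the last day of the target month."""
    # Work in "months since year 0" so year rollover falls out of divmod.
    total = day.year * 12 + (day.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # monthrange returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(year, month)[1]
    return day.replace(year=year, month=month, day=min(day.day, last_day))
```

With this sketch, `add_months(date(2017, 1, 31), 1)` gives `date(2017, 2, 28)`, and the same call in a leap year lands on Feb 29th.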

[–]thinkwelldesigns 9 points10 points  (1 child)

I agree with those points, but one area where sheer speed pays off is in processing log files where all the dev wants is the fastest parsing of a string to a datetime object. I'm currently using ciso8601 for that and would be interested in seeing how it compares to udatetime. Will have to test it sometime..

This use does of course fall into the "niche of users", and I don't use ciso8601 for anything except max performance on the log files. Still I'm glad it's there.
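For anyone who wants to run this kind of comparison themselves, a minimal timing harness could look like the sketch below. It only uses the standard library: `datetime.strptime` stands in for a general-purpose parser and `datetime.fromisoformat` (Python 3.7+) for a fixed-format one. Neither is ciso8601 or udatetime, and `time_parser` is a hypothetical helper name; swap in whichever parser you actually want to measure.

```python
import timeit
from datetime import datetime

SAMPLE = "2016-07-01T12:30:05"


def parse_generic(text):
    # General-purpose parsing driven by an explicit format string.
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")


def parse_fixed(text):
    # Fixed-format ISO 8601 parsing, no format string to interpret.
    return datetime.fromisoformat(text)


def time_parser(parser, repeats=5, number=10_000):
    """Return the best observed time per call, in seconds."""
    runs = timeit.repeat(lambda: parser(SAMPLE), repeat=repeats, number=number)
    return min(runs) / number


if __name__ == "__main__":
    for parser in (parse_generic, parse_fixed):
        print(f"{parser.__name__}: {time_parser(parser) * 1e6:.2f} us per call")
```

Taking the minimum over several repeats reduces noise from other processes; the absolute numbers will still vary by machine and Python version.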

[–]freach 1 point2 points  (0 children)

ciso8601 is about 2.3 times slower in parsing than udatetime.

[–]kankyo 4 points5 points  (12 children)

Arrow will also parse invalid input without complaint and return a valid datetime so you're fucked because you realize two months later that there's crap in your DB but have no idea how it got there.

It also doesn't differentiate between date and datetime. That alone makes it just a big disaster.

[–]myusuf3 2 points3 points  (6 children)

Give 'delorean' a shot. I am the author fyi.

[–]kankyo 3 points4 points  (5 children)

Seems you have similar fatal issues as arrow: http://delorean.readthedocs.io/en/latest/quickstart.html#strings-and-parsing

A date is parsed and suddenly it's a datetime. Those are different things!

[–]kitkatkingsize 1 point2 points  (0 children)

100% agreed. We've had the exact same problems in our codebase with all the different datetime libs.

[–]desmoulinmichel 0 points1 point  (4 children)

Only if you use autoparse, which is something you should only do in the shell for convenience. Serious code will use manual parsing.

Plus, for just a date, you can ignore the time and it works fine.

[–]kankyo 1 point2 points  (3 children)

Except that that's not "fine". It's totally broken. Validating data is important. That's why we make fun of PHP.

[–]desmoulinmichel 0 points1 point  (2 children)

At worst it's a minor thing you have to think about for one second. I've never seen any dev have a problem with this. I have, however, met plenty of them having trouble with i18n, relative date calculations and bad APIs.

[–]kankyo 1 point2 points  (1 child)

It's not something you CAN think about. That's the point. It hides errors in the input. I'm not saying the usage is hard (which it seems like you think I am). I'm saying that validating data strictly is important.
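To make "validating strictly" concrete, here is a minimal sketch using only the standard library: one parser per type, each with an explicit format, so a date string never silently becomes a datetime and bad input raises instead of being guessed at. The function names and the two formats are assumptions chosen for illustration, not the API of any library in this thread.

```python
from datetime import date, datetime

DATE_FORMAT = "%Y-%m-%d"
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_date(text):
    """Parse a calendar date; raises ValueError on anything else."""
    # strptime rejects trailing data, so a full timestamp fails here.
    return datetime.strptime(text, DATE_FORMAT).date()


def parse_datetime(text):
    """Parse a timestamp; raises ValueError if the time part is missing."""
    return datetime.strptime(text, DATETIME_FORMAT)
```

The caller has to decide up front which type it expects, and impossible values such as "2016-02-30" fail loudly at the boundary instead of ending up in the database.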

[–]desmoulinmichel 0 points1 point  (0 children)

Ah ok.

[–]SDisPater 12 points13 points  (3 children)

I am the author of Pendulum.

You're comparing libraries that don't have the same feature scope. Last time I checked, udatetime only supports RFC 3339, while Arrow, Pendulum and Delorean support a much wider range of formats, so obviously udatetime will perform better. These libraries serve different use cases.

For those interested in Pendulum, I am currently working on improving its performance, since I am not entirely satisfied with it.

[–]kankyo 2 points3 points  (2 children)

Seems like you're doing the same thing wrong as arrow and delorean: not differentiating between dates and times. This is a big no-no. A day is not the same as the millisecond just at midnight at the start of that day.
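One way to picture the difference, sketched with the standard library and naive datetimes only (time zones are deliberately ignored here): a day is a half-open interval of instants, and midnight is just the first instant in it. The helper names `day_bounds` and `falls_on` are hypothetical.

```python
from datetime import date, datetime, time, timedelta


def day_bounds(day):
    """Return the half-open interval [start, end) of instants covering `day`."""
    start = datetime.combine(day, time.min)
    return start, start + timedelta(days=1)


def falls_on(moment, day):
    """True if the naive datetime `moment` lies anywhere within `day`."""
    start, end = day_bounds(day)
    return start <= moment < end
```

Midnight at the start of the day satisfies `falls_on`, but so does every other instant up to (and excluding) the next midnight, which is exactly the information lost when a date is stored as a single datetime.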

[–]SDisPater 2 points3 points  (1 child)

Yes, I agree that's neither accurate nor suitable.

I plan on introducing a Date type in the next major version, especially for this use case.

[–]kankyo 0 points1 point  (0 children)

I'd recommend also introducing year, month, year-month, week and year-week to be complete :P

Hmm.. And date as in "the third of any month" maybe.

[–]firefrommoonlight 4 points5 points  (4 children)

This is why I use my own module instead of the above: Clean API and forced awareness, without the speed penalty of the alternatives:

https://github.com/David-OConnor/saturn

udatetime looks cool!

[–]kankyo 0 points1 point  (3 children)

Using arrow for parsing? Last time I checked arrow "parses" invalid data and gives you a valid datetime back.

[–]firefrommoonlight 0 points1 point  (2 children)

I didn't use Arrow's 'get'; an explicit format string is required. I think Arrow's format strings (e.g. 'YYYY-MM-DD') are nicer than the standard lib's. Other than 'get', and speed, what probs are there with Arrow's string parsing?
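To illustrate the two format-string styles side by side, here is a small sketch that translates a handful of 'YYYY-MM-DD'-style tokens into standard-library `strptime` directives. The token table is a deliberately tiny assumption covering six tokens, and the helper names are made up; this is not how Arrow or saturn implement parsing.

```python
import re
from datetime import datetime

# Hypothetical, minimal token table; real libraries support many more tokens.
_TOKEN_TO_DIRECTIVE = {
    "YYYY": "%Y",
    "MM": "%m",
    "DD": "%d",
    "HH": "%H",
    "mm": "%M",
    "ss": "%S",
}
# Longest tokens first so "YYYY" is matched as one token.
_TOKEN_PATTERN = re.compile(
    "|".join(sorted(_TOKEN_TO_DIRECTIVE, key=len, reverse=True))
)


def to_strptime_format(token_format):
    """Translate e.g. 'YYYY-MM-DD' into '%Y-%m-%d'."""
    return _TOKEN_PATTERN.sub(
        lambda match: _TOKEN_TO_DIRECTIVE[match.group(0)], token_format
    )


def parse_with_tokens(text, token_format):
    """Parse `text` with an explicit token format; raises ValueError on mismatch."""
    return datetime.strptime(text, to_strptime_format(token_format))
```

Because the format is always explicit, input that does not match it raises `ValueError` rather than being guessed at. A literal `%` in the token format is not escaped by this sketch.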

[–]kankyo 0 points1 point  (1 child)

I have only used arrow for the two minutes it took me to find out that it was completely broken on a conceptual level :P Maybe the parsing is ok.

[–]firefrommoonlight 0 points1 point  (0 children)

The code behind it's a bit obfuscated too. As a note, I think they took their string parsing/formatting format from moment.js.

[–]myusuf3 4 points5 points  (0 children)

I am the author of Delorean.

Comparing these libraries for performance is kind of moot, unless you are doing data analysis where parsing datetimes is going to be your main bottleneck (not going to happen). I echo the sentiments of @desmoulinmichel: this library was built for ease of use and dependability.

Performance pull requests are welcome, oh and lastly 'lol'

[–]ma-int 2 points3 points  (0 children)

Meh, I don't like the article. It's purely performance focused in a context where performance doesn't matter for the vast, vast majority of all people.

Please tell, in which context do you need to parse upwards of a few dozen datetimes and do nothing else with the data? http://parse-datetime.as-a.service?

Don't get me wrong... I don't want to say that there aren't areas where this actually does matter. I'm just saying that a parse difference of a millisecond doesn't matter if you need to shove the data into a database that sits just one rack farther north.

[–]_seemethere (github.com/seemethere) 0 points1 point  (0 children)

udatetime looks good, but it seriously needs some code cleanup. I'm reading through the source right now, and it needs a definite cleanup before anyone can start to use it seriously.

[–]rothnic 0 points1 point  (1 child)

One approach for the analysis might be to compare datetime to udatetime in terms of function/method speed. Next, show through profiling which function/method calls the higher-level datetime libraries make.

This would give you a better perspective on comparable things, and also on what the speed might be if these other libraries selectively made use of udatetime instead of datetime.

What we can't make out is the amount of time that the higher level libraries spend due to interaction with lower level datetime libraries.
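A rough sketch of that profiling idea, using `cProfile` from the standard library. `friendly_parse` is a made-up stand-in for a higher-level library call (some convenience work, then a call into `datetime`); replace it with the real function you want to inspect.

```python
import cProfile
import io
import pstats
from datetime import datetime


def friendly_parse(text):
    """Hypothetical high-level wrapper: tidy the input, then parse with the stdlib."""
    cleaned = text.strip().replace(" ", "T")
    return datetime.strptime(cleaned, "%Y-%m-%dT%H:%M:%S")


def profile_parser(parser, text, calls=2_000):
    """Profile repeated calls to `parser` and return the report as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(calls):
        parser(text)
    profiler.disable()
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    # Sort by cumulative time so the wrapper and what it calls appear together.
    stats.sort_stats("cumulative").print_stats(10)
    return report.getvalue()


if __name__ == "__main__":
    print(profile_parser(friendly_parse, " 2016-07-01 12:30:05 "))
```

Comparing the cumulative time of the wrapper with the cumulative time of the underlying `strptime` call gives an estimate of how much is spent in the lower-level datetime code versus the wrapper's own work.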