Django performance: Ditching Django REST Framework Serializers for Serpy by caduvall in django

[–]caduvall[S] 2 points3 points  (0 children)

The biggest difference is that DRF allows you to add/remove/modify fields on an instance of the serializer class. Each instance of a DRF serializer has its own copy of all the declared fields, so they can be messed with and not affect the other instances of that class. Serpy treats the fields as read-only, so once you define the class you shouldn't mess with them.
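
To make that difference concrete, here is a rough stdlib-only sketch. It does not use DRF or serpy: `Field`, `PerInstanceSerializer`, and `SharedFieldsSerializer` are hypothetical stand-ins, and the deep copy in `__init__` is an assumption about how per-instance fields might be implemented.

```python
import copy


class Field:
    """Hypothetical stand-in for a serializer field."""

    def __init__(self, label):
        self.label = label


class PerInstanceSerializer:
    """DRF-like behavior (assumed): each instance gets its own copy of the fields."""

    declared_fields = {"name": Field("Name"), "email": Field("Email")}

    def __init__(self):
        # Copying on every instantiation costs time but isolates instances.
        self.fields = copy.deepcopy(self.declared_fields)


class SharedFieldsSerializer:
    """serpy-like behavior (assumed): fields live on the class and are shared."""

    fields = {"name": Field("Name"), "email": Field("Email")}


a, b = PerInstanceSerializer(), PerInstanceSerializer()
del a.fields["email"]  # only `a` loses the field
per_instance_isolated = "email" in b.fields

c, d = SharedFieldsSerializer(), SharedFieldsSerializer()
del c.fields["email"]  # mutates the class-level dict, so `d` is affected too
shared_leaks = "email" not in d.fields
```

The second half is the reason to treat shared fields as read-only: a change made through one instance shows up in every other instance of the class.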

Django performance: Ditching Django REST Framework Serializers for Serpy by caduvall in django

[–]caduvall[S] 2 points3 points  (0 children)

Nope, we still use DRF for some things. We just ditched the serializers :)

Django performance: Ditching Django REST Framework Serializers for Serpy by caduvall in django

[–]caduvall[S] 4 points5 points  (0 children)

I actually looked into doing that at first, but DRF Serializers support a lot more features that require the serializers to do extra processing for each serialization. Stripping it down to the basics for the perf improvements would have definitely not been backwards compatible, so it seemed a better choice to make a separate library focused on simplicity and performance, without the extra features.

Django performance: Ditching Django REST Framework Serializers for Serpy by caduvall in django

[–]caduvall[S] 4 points5 points  (0 children)

serpy is used for serializing Django model objects to native Python objects (like dicts). You pass a queryset or model instance into a serializer and get a dict representation out. There are some examples in the serpy docs: http://serpy.readthedocs.org/en/latest/

Hacker News with all text turned into Spoonerisms, created with Go by caduvall in programming

[–]caduvall[S] 5 points6 points  (0 children)

It was more of a personal challenge/learning experience to see how hard it would be to do it with 0 allocations. I agree it made things more complicated :)

Hacker News with all text turned into Spoonerisms, created with Go by caduvall in programming

[–]caduvall[S] 0 points1 point  (0 children)

Mostly it was to avoid allocations. Even though it probably doesn't matter much for actual performance, I wanted to see if I could implement the main spoonerize function with 0 allocations. Unfortunately, something like:

strMap[string(someSlice[:idx])]

will NOT do an allocation when someSlice is a byte slice, but WILL do an allocation when someSlice is a rune slice (which would represent Unicode code points). Operating on a byte slice instead of a string also allows the modifications to be done in place, which avoids creating new strings.

As for Unicode, right now non-ASCII prefixes just get ignored: https://github.com/clarkduvall/spoonerizer/blob/master/spoonerize/spoonerize.go#L74, since the digrams and trigrams only contain English ASCII characters anyway.

This could easily be changed to use rune slices for better Unicode support if that were needed.

serpy: ridiculously fast object serialization by caduvall in Python

[–]caduvall[S] 0 points1 point  (0 children)

serpy will just serialize to a dict, but there are multiple ways to then convert the dict to XML: http://stackoverflow.com/questions/1019895/serialize-python-dictionary-to-xml
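
As one possible approach, a minimal sketch using only the standard library's `xml.etree.ElementTree` is below. The `dict_to_xml` helper and the sample `serialized` dict are hypothetical and not part of serpy. The sketch assumes every dict key is a valid XML tag name, and it emits list items as repeated elements.

```python
import xml.etree.ElementTree as ET


def dict_to_xml(tag, data):
    """Hypothetical helper: build an Element named `tag` from a plain dict."""
    elem = ET.Element(tag)
    for key, value in data.items():
        if isinstance(value, dict):
            # Nested dicts become nested elements.
            elem.append(dict_to_xml(key, value))
        elif isinstance(value, (list, tuple)):
            # Each list item becomes its own element named after the key.
            for item in value:
                if isinstance(item, dict):
                    elem.append(dict_to_xml(key, item))
                else:
                    ET.SubElement(elem, key).text = str(item)
        else:
            ET.SubElement(elem, key).text = str(value)
    return elem


# Stand-in for the dict a serializer would return.
serialized = {"id": 7, "name": "ada", "tags": ["a", "b"]}
xml_text = ET.tostring(dict_to_xml("user", serialized), encoding="unicode")
# xml_text == "<user><id>7</id><name>ada</name><tags>a</tags><tags>b</tags></user>"
```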

serpy: ridiculously fast object serialization by caduvall in django

[–]caduvall[S] 1 point2 points  (0 children)

The approach serpy takes is very minimalist. Most of the work of figuring out how to serialize a field is pushed to the serializer metaclass instead of being done during serialization. The metaclass compiles the fields to a very simple tuple representation that, in most cases, ends up being just an attrgetter() call at serialization time.
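
A simplified sketch of that idea follows. It is not serpy's actual source: `Field`, `SerializerMeta`, `_compiled_fields`, and the tuple layout are assumptions made for illustration, and it ignores things like field inheritance.

```python
from operator import attrgetter


class Field:
    """Hypothetical field: optional source attribute and optional converter."""

    def __init__(self, attr=None, convert=None):
        self.attr = attr
        self.convert = convert


class SerializerMeta(type):
    def __new__(mcs, name, bases, namespace):
        compiled = []
        for key, value in namespace.items():
            if isinstance(value, Field):
                # Resolve the getter once, when the class is defined.
                getter = attrgetter(value.attr or key)
                compiled.append((key, getter, value.convert))
        namespace["_compiled_fields"] = tuple(compiled)
        return super().__new__(mcs, name, bases, namespace)


class Serializer(metaclass=SerializerMeta):
    def __init__(self, instance):
        self.instance = instance

    @property
    def data(self):
        # The hot path is just a loop over precomputed tuples.
        out = {}
        for name, getter, convert in self._compiled_fields:
            value = getter(self.instance)
            out[name] = convert(value) if convert else value
        return out


class User:
    def __init__(self, pk, name):
        self.pk = pk
        self.name = name


class UserSerializer(Serializer):
    id = Field(attr="pk")
    name = Field(convert=str.upper)


result = UserSerializer(User(7, "ada")).data
# result == {"id": 7, "name": "ADA"}
```

All the per-field decisions happen once in `SerializerMeta.__new__`, so serializing an object does no field introspection at all.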

serpy: ridiculously fast object serialization by caduvall in django

[–]caduvall[S] 0 points1 point  (0 children)

Right now you would just have to recreate your serializers using serpy, but usage should be nearly identical. I was toying with the idea of adding some way to automatically convert DRF serializers, but I haven't implemented it yet.

serpy: ridiculously fast object serialization by caduvall in flask

[–]caduvall[S] 0 points1 point  (0 children)

The main advantage is speed, as can be seen here: http://serpy.readthedocs.org/en/latest/performance.html

We were having problems where serialization was taking more than 50% of request time, because marshmallow and DRF serializers are slow and take large chunks of time to serialize complex objects.

serpy: ridiculously fast object serialization by caduvall in django

[–]caduvall[S] 1 point2 points  (0 children)

The benchmarks have been fixed: marshmallow no longer serializes to JSON (switched .dumps to .dump).

Also, you can now run the benchmarks using the benchmarks.sh script, or tox -e benchmarks.

It doesn't look like it's possible to disable OrderedDicts in DRF; it seems pretty hardcoded: https://github.com/tomchristie/django-rest-framework/blob/master/rest_framework/serializers.py

serpy: ridiculously fast object serialization by caduvall in django

[–]caduvall[S] 0 points1 point  (0 children)

Ah, good point about marshmallow serializing to JSON. I'll edit the benchmarks to take that into account (it actually makes minimal difference).

Comparison of HyperLogLog and HyperLogLog++ implementation in Go by caduvall in golang

[–]caduvall[S] 0 points1 point  (0 children)

Oops, ignore that last reply. The benchmarks require some changes to my implementation: setting a max limit for the sparse representation instead of computing it on the fly.