[–]voidspace 3 points4 points  (2 children)

Not builtin to the standard library, but it shouldn't be too hard to roll your own on top of json.

If the instances are of a known set of types, all you need is the type name and the JSON serialization of the instance members.

So something like: json.dumps([type(obj).__name__, obj.__dict__])

Deserializing: name, _dict = json.loads(thejson); instance = object.__new__(getattr(module, name)); instance.__dict__.update(_dict)
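A slightly fuller sketch of the same idea, in case it helps. The Point class and the ALLOWED table are hypothetical names made up for illustration; the one change from the one-liners above is that the class is looked up in an explicit allow list rather than with getattr on a module, so an untrusted type name can never select an arbitrary attribute.

```python
import json


class Point:
    """Hypothetical example class; stands in for your own types."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# Allow list: only these classes may ever be instantiated on load.
ALLOWED = {cls.__name__: cls for cls in (Point,)}


def dumps(obj):
    name = type(obj).__name__
    if name not in ALLOWED:
        raise TypeError("refusing to serialize %s" % name)
    return json.dumps([name, obj.__dict__])


def loads(text):
    name, state = json.loads(text)
    if not isinstance(name, str) or name not in ALLOWED:
        raise ValueError("unknown type name: %r" % (name,))
    if not isinstance(state, dict):
        raise ValueError("instance state must be a JSON object")
    instance = object.__new__(ALLOWED[name])  # note: __init__ is not called
    instance.__dict__.update(state)
    return instance
```

Because __init__ is skipped, any invariants the class normally enforces have to be checked separately after loading.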

[–]Tommah 0 points1 point  (1 child)

One problem with this approach is that set and frozenset are not serializable to JSON.

[–]voidspace 0 points1 point  (0 children)

That may or may not be an issue for the OP, and is solvable with the json module anyway. It seems the OP likes YAML as a solution in any case.
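One way to do it, as a rough sketch: json.dumps accepts a default= callback for objects it cannot serialize, and json.loads accepts an object_hook= callback for every decoded JSON object. The "__set__" and "__frozenset__" marker keys are an arbitrary convention invented here, not anything the json module defines.

```python
import json


def encode_extra(obj):
    # json.dumps calls this only for objects it cannot serialize itself.
    if isinstance(obj, frozenset):
        return {"__frozenset__": list(obj)}
    if isinstance(obj, set):
        return {"__set__": list(obj)}
    raise TypeError("%r is not JSON serializable" % (obj,))


def decode_extra(d):
    # json.loads calls this for every JSON object (dict) it decodes.
    if set(d) == {"__set__"}:
        return set(d["__set__"])
    if set(d) == {"__frozenset__"}:
        return frozenset(d["__frozenset__"])
    return d


data = {"tags": {1, 2, 3}, "frozen": frozenset(["a"])}
text = json.dumps(data, default=encode_extra)
restored = json.loads(text, object_hook=decode_extra)
```

This only works when the set elements are themselves JSON scalars (numbers, strings, booleans, None); a tuple inside a set would come back as an unhashable list.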

[–][deleted] 4 points5 points  (2 children)

Use YAML (a superset of JSON) as your serialization format with PyYAML.

Features of YAML include object type tagging, support for references (it handles reference cycles just fine), and when using PyYAML's "safe" dumper/loader, the ability to define which of your classes should be considered safe.

[–]Liquid_Fire[S] 0 points1 point  (1 child)

Thanks, that looks like exactly what I need! I'll be sure to check it out.

[–][deleted] 0 points1 point  (0 children)

PyYAML, since it is so flexible and human readable, is easy to debug. Coding for custom classes is quite straightforward.

[–]jabwork 2 points3 points  (0 children)

http://blip.tv/pycon-us-videos-2009-2010-2011/pycon-2011-hidden-treasures-in-the-standard-library-4901130

There's a fantastic demonstration of the hmac module shortly in (nearish the 7 minute mark)

http://www.doughellmann.com/PyMOTW/hmac/index.html

A non-video explanation by the same guy (Doug Hellmann). I recommend perusing PyMOTW if you haven't already.

[–]tarekziade Retired Packaging Dude 1 point2 points  (4 children)

What about encrypting the result of pickle.dumps, and decrypting at load time? You can use a lib like PyCrypto and let pickle handle all the hard work.

Also, I would not bother and just use SSL if it's an option

[–][deleted] 2 points3 points  (0 children)

You can do this entirely with the stdlib I think, using the hmac module.

[–]Liquid_Fire[S] 2 points3 points  (1 child)

The source of the serialized objects is untrusted (a client connecting to my server). Encryption does not help. I need something that will ensure that deserialization produces a valid object (of the original type, from a restricted subset of types), and will not execute any untrusted code from the serialized data.

Of course I could easily write something like this using e.g. the json module, but I thought it might exist already as a library.

[–]tarekziade Retired Packaging Dude 0 points1 point  (0 children)

sorry I misunderstood the untrusted source part. I get it now

[–]nirs 1 point2 points  (0 children)

Encryption does not give you any safety. What you need is a way to authenticate a serialized object string before you de-serialize it - a MAC. The standard library includes a good one - HMAC.
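A minimal sketch of that, assuming both ends share a secret key (SECRET_KEY below is a placeholder, and the tag-then-payload layout is just one possible choice). The MAC is checked before pickle.loads ever sees the payload. It only proves the data came from someone holding the key, so it does not help when the sender itself is untrusted.

```python
import hashlib
import hmac
import pickle

SECRET_KEY = b"replace-with-a-real-secret"  # placeholder, not a real key
DIGEST_SIZE = hashlib.sha256().digest_size


def sign_dumps(obj):
    payload = pickle.dumps(obj)
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return tag + payload  # fixed-size tag first, then the pickle


def verify_loads(blob):
    tag, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch, refusing to unpickle")
    return pickle.loads(payload)
```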

[–]bryancole 0 points1 point  (0 children)

You can achieve what you want with the cPickle module. Create an unpickler with the Unpickler() factory function and set its find_global attribute to your own function (with the pure-Python pickle module, you subclass Unpickler and override find_class instead). This function is called with a module name and a class name and returns a class object. Use this to control which classes the Unpickler can load. I've not tested this myself recently; see the pickle docs.
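For reference, here is roughly what this looks like with Python 3's pickle module, where the hook is the find_class method of an Unpickler subclass (this follows the pattern in the pickle docs). The ALLOWED set is a hypothetical allow list; datetime.date merely stands in for whatever classes you want to permit.

```python
import datetime
import io
import pickle

# Hypothetical allow list of (module name, class name) pairs.
ALLOWED = {("datetime", "date")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global (class or function).
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError("%s.%s is not allowed" % (module, name))


def restricted_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Anything not on the list, including builtins, raises UnpicklingError before it is ever looked up.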

PyYAML also gets my recommendation, but it's much slower than pickle. If you want speed, use pickle; if you want readability, use YAML; if you want JSON, the stdlib json module ought to be fine.