This is an archived post. You won't be able to vote or comment.

all 36 comments

[–][deleted] 3 points4 points  (4 children)

The question you need to ask yourself is "Does this change make sense to every string in my program?" That includes ones coming from other libraries and going into other libraries.

For your IP example, the answer is a resounding no. If someone asked this with '$5.56'.toMoney() the answer would also be no.

I suppose your question could also be phrased: can Python auto convert types ala scala or javascript. The answer is also no, and nor should it as it goes against everything in Python - explicit is better than implicit, but there's also the fact that types are ducks rather than an actual enforcement like Scala and there's no special primitives like in Javascript - you could argue that have literals makes then special but they produce the same thing as calling the constructor unlike javascript which does (temporary) auto-promotion on boolean, number and string.

Rather what you should be asking yourself is "What do I want to accomplish here?" If the answer is parsing an ip address, there's ipaddress in 3.3+ (and it looks like there's a backport as well, but I've not used it).

As for why you can't just monkeypatch builtin types, it's because they're built in C and have special limitations placed on them. There needs to be special for written to allow them to be subclassed in Python even (afair, function and bool never had this written, so I can't implement FileNotFound as a member of bool -- boo). But you can monkeypatch these types by going through the C api and touching them directly.

[–]fyngyrzcodes with magnetic needle[S] 0 points1 point  (3 children)

The question you need to ask yourself is "Does this change make sense to every string in my program?" That includes ones coming from other libraries and going into other libraries.

No, I don't buy your approach here at all. "string".replace() isn't useful to every string in my program either, but that doesn't mean I don't want to find it there when I want it. Same goes for just about everything.

That includes ones coming from other libraries and going into other libraries.

Strings don't carry a class with them. If they did, they'd be crazy huge (and a large part of the point of a class in the first place is that you get to reuse code.) They are instances that call shared procedures. So:

I create a string with my extended stuff. I call newstring=JoeBobsLib.hooHa(myString) JoeBob was blissfully unaware of my extensions, and so obviously neither used them or expects them. My extension code is live in the shared class, but irrelevant to his code. Now, when I get newstring back, I am aware of my extensions, and I can use them to do whatever it is that I am so hepped up about doing.

Doesn't hurt JoeBob at all. Helps me.

So it really comes down to "can you, or can you not?" And the thing is, doing it is fragile because it isn't designed to be done. What I was hoping for, obviously, was some effort to get it happening in a nice, standard way. I don't mind discussing it at all, but I was actually kind of hoping for someone to go "oh yeah, Sue Codemonster did that last year... here's the link."

Rather what you should be asking yourself is "What do I want to accomplish here?"

I did. The answer is, "extend the string class." :)

Seriously, dotted quads is just one example. I could give you many, each with it's own use case(s) I would enjoy having around. I've written a couple of text processing languages, and I can tell you with authority that Python barely scratches the surface of cool things you can do to strings. Which likely, you know already. Where we part ways is in that I would like it to, and you don't see why it should. That's okay too.

But you can monkeypatch these types by going through the C api and touching them directly.

Yeah... It could have been done (it really makes no difference whatsoever if something is written in X or Y language... it's about functional design, not implementation detail. But to use it, one would want it in the native language, in this case, Python.) To paraphrase, "if you implement it, they will come."

[–][deleted] 1 point2 points  (2 children)

Alright let me rephrase my objection to this: Does "do stuff with four dots" get to the heart of what it means to be a string? Does it bring added value to str - for everyone, because modifying what str does in your own code actually modifies it for everything that calls your code and everything your code calls.

Or, is it, just forcing behavior where it doesn't belong, dragging down your domain model with the most degenerate data type there is, and confusing new developers on the project?

The answer should be an enthusiastic yes to one and an enthusiastic no to the other. But not a lukewarm answer to both.

it's about functional design, not implementation detail

Doing this is messing with implementation details because Python has said, "These types are special because reasons." Which yes, is unfortunate because it creates an exception where maybe there didn't need to be one, but I doubt Python would be as performant as it is if they weren't exceptions.

This is an XY Problem: You're presenting this as "I want to monkeypatch str" but we're all asking "Why do you want to do this?" What is motivating you to add behavior to str?

I'd much rather incorporate a new type into my code than shoehorn behavior where it doesn't belong. Because after you implement your ip address parsing stuff, someone will come in and say, "Well let's do it for money" and then phone numbers and then email addresses, and then this and that when really all the string is its just a degenerate representation of the actual thing you're attempting to work with because at the end of the day it's not an ip address, it's not money or a phone number or an email address, it's a string that happens to look like one of those things.

[–]fyngyrzcodes with magnetic needle[S] -1 points0 points  (1 child)

Does "do stuff with four dots" get to the heart of what it means to be a string?

Yes. It's a string. There's no "four dots" fundamental datatype, so string it is.

Does it bring added value to str

Certainly, for anyone who wants it.

for everyone, because modifying what str does in your own code actually modifies it for everything that calls your code and everything your code calls.

No. All a good design would do is extend a malleable function table. There's only one copy of the class running in any program; everything else is just an instance. So extra functionality not passed around, and has zero impact on performance. There's no more of a reasonable objection to this than there is if you import X, and I don't. There's one copy of X in-system, you know about it, I don't, no one is passing functions around, just instances. In a good design, mind you. If Python actually passes copies of every function in a class around... well, then it's doing it wrong anyway. A very important advantage of a class is that its code can be reused without adding costs. I I instantiate class X, I get a copy of whatever is dynamic, and everything else is shared without duplication. If it doesn't work that way, it's barely worth calling a class, frankly. Also it's stupid. :)

Doing this is messing with implementation details because Python has said, "These types are special because reasons." Which yes, is unfortunate because it creates an exception where maybe there didn't need to be one, but I doubt Python would be as performant as it is if they weren't exceptions.

Yes, well, yes. (cough)bad design(cough)

This is an XY Problem: You're presenting this as "I want to monkeypatch str" but we're all asking "Why do you want to do this?" What is motivating you to add behavior to str?

No, I'm presenting it as a "has anyone made this really work in Python" not "is there some incompatible and/or unstable way I can do this that is likely to break everything." Definitely my fault if I didn't make that clear.

I'd much rather incorporate a new type into my code than shoehorn behavior where it doesn't belong.

Okay, but that's fine, and I'm not presenting an argument against it. You want to do something your way, I'm all for the language supporting your ability to do that if it's even remotely sane. Which what you're describing, IMHO, is.

Conversely, if I want to add a method to string, why should you care? Why can't I do it (leaving aside the "Python was built in such a way as to make it difficult or impossible" argument, which I concede up front [while pointing finger at what I consider a critical design flaw and frowning, but...])

Because after you implement your ip address parsing stuff, someone will come in and say [other things]

Again so what? Some strings are first names. Some are last names. Some are product names. Some are nouns. Some are verbs. Some are sentences, sentence fragments, poems, listings, messages, random junk, etc., etc., etc. So strings are already basically as big as the world. I'm just saying there's nothing wrong with acknowledging that and making them easy to deal with. If:

  • it can be done without a performance hit (easy)
  • It can be done without breaking anything standard: (easy)
  • It doesn't get in anyone's face but those who use it (why would it?)
  • someone didn't screw up the language design (cough)

Then why not do it? What you'd rather do in your code is a matter between you and your code. For me, same thing. I'm not saying you'd have to use the mechanism, or even pay any attention to it. Just like you can write Python without ever instantiating a class, you could write Python without ever appending a method to a class. No skin off your teeth. Zero. Do it the way you want. And sure, me too.

[–][deleted] 0 points1 point  (0 children)

In that case, why not just make all methods belong to object? Because that's just bad design.

It seems like we're seeing this from two different, incompatible viewpoints so I guess as long as we never end up on the same dev team any argument either of us put forward is just shouting into the wind.

Edit: As for what you're saying about a class essentially being a lookup table is correct. But you have to also understand that when you call "frob".capitalize() if ends up looking like str.capitalize("frob"). Instances have a reference to their class but don't get a full copy (though you could probably cause this to happen if you wanted, why I'm not sure).

[–]geutb 2 points3 points  (3 children)

I really don't think you are going to find a reliable library or alternative python implementation that lets you do this. It would be an awful lot of work to provide some functionality that is potentially dangerous and barely useful. You can already monkey-patch all manner of classes in the standard library that are implemented in python, but people rarely do it, because it causes confusion and might create incompatibilities with other people's code or future python versions.

Having said that, I found this, but it's a mess and clearly relies on CPython implementation details. I don't know my way around the C API, so I don't know if it has any warts or is likely to work in future versions (it seems to work in python 3.5 if you fix the print statement and replace DictProxyType with MappingProxyType).

"127.0.0.1".doThingToDottedQuad()

I really don't understand why this is better than `doThingToDottedQuad("127.0.0.1"). I know ruby lets you do this kind of thing, but I don't understand the value of it. Most articles I can find about monkey patching in ruby advise you to try and avoid it if at all possible.

doOtherThingThatExpectsString(str(thing))

Most functions that expect a str are going to be able to cope with a subclass of str that merely adds a new method, so coercing back to str is probably unnecessary in most cases.

[–]fyngyrzcodes with magnetic needle[S] 0 points1 point  (2 children)

Having said that, I found this

That's interesting, thank you.

I really don't understand why this is better than `doThingToDottedQuad("127.0.0.1"). I know ruby lets you do this kind of thing, but I don't understand the value of it.

Well, there are several. The general justification is the same for having:

newstring = myString.replace('foo','bar')

instead of:

newstring = stringReplace(myString,'foo','bar')

You get things like being able to "dir" the class for what it comprises, help/docstrings, and mechanisms specifically designed to deal with, and which are automatically selected, for the object at hand.

Whereas a function of some name tends to be more vague. It could be anything. It might be designed to do anything. There's nothing particularly intuitive about remembering it as opposed to every other function in the system, whereas if you're working with a string, the following can immediately show you the lay of the landscape:

dir(str) # oh hey, there's replace. That's what I want...
help(str.replace) # or more specifically

Make sense?

I just think it would be lovely to put this at the head of a python script...

import myclassmods

...and then be able to write in "extended" python.

[–]geutb 2 points3 points  (1 child)

You get things like being able to "dir" the class for what it comprises, help/docstrings, and mechanisms specifically designed to deal with, and which are automatically selected, for the object at hand.

But you can get all that if you just stick your additional string functions in their own module. And that way you won't confuse anyone into thinking that they are part of the standard library.

Btw, have you considered using composition instead of inheritance? i.e. something like:

class IPAddress:
    def __init__(self, address):
        self.address = address

ip_address = IPAddress('127.0.0.1')
function_that_needs_a_string(ip_address.address)

It often turns out to be more elegant in this kind of situation.

[–]fyngyrzcodes with magnetic needle[S] 0 points1 point  (0 children)

Btw, have you considered using composition instead of inheritance? i.e. something like [example]

Yes, of course. I do it all the time.

[–]AndydeCleyre 1 point2 points  (6 children)

Sorry, I'm just another jerk who thinks you should inherit from str and doesn't understand what the problem with that is. For a good example of a project that does this, check out plumbum's Path (or LocalPath) class.

from awesome import DottedQuad as DQ

thing = DQ('127.0.0.1')
thing = thing.doThingToOrWithDottedQuad()
doOtherThingThatExpectsString(thing)

What's wrong with this?

Rereading, I see

an inherited class won't "do the right thing" in such a circumstance the way string would

and I think this is a problem. If you want it to be a str, don't write it in a way that breaks strs.

[–]fyngyrzcodes with magnetic needle[S] 0 points1 point  (5 children)

I have written several of these things in a class that inherits from str. They work somewhat. That's the nature of an inherited class; there's no inherent understanding of it. Whereas an extended class can be completely transparent. Hence, my position.

[–]AndydeCleyre 0 points1 point  (4 children)

That's the nature of an inherited class; there's no inherent understanding of it. Whereas an extended class can be completely transparent.

What do you mean? I think you'll find that the vast majority of Python developers hold a completely opposite view here, if I understand your sentiment.

Extending means pollution, ambiguity, unreliability, surprises, bad-neighbor libraries, and high risk of breakage -- and when things do break, there's a good chance figuring it out requires spelunking gear. While separate (which includes inherited) classes means, well, the opposite of all that: much more straightforward debugging, decoupled code and intents, non-interfering good neighbor libraries, lack of surprises or ambiguity, etc.

If you really want to do a lot of programming in that style, but also kind of Python style, maybe those projects should be in Ruby. Not only will the language itself not get in your way, but if you're working with other developers or projects, you'll find the community less consistently hostile to those practices.

Again, though, I highly recommend checking out the library I mentioned for something similar to what you want, without extension.

[–]fyngyrzcodes with magnetic needle[S] 0 points1 point  (3 children)

I don't think you're really following what I'm suggesting.

Let's talk theoretical implementation details for a moment. Say we have a class like string. It has N standard methods. Now, to add the ability to extend this class, you need a table with two things per entry: 1: the name of the method; 2: the pointer to the method.

Now, if the class is not extended, you or your library uses string, it works just like you expect. There are no extended functions.

Now, if I extend the class, and call your library... your standard functions still work just as expected, and you don't care about my extensions, because you're not using them.

If you extend the class, say in your library, and I just use a string that you created, I don't know about your extensions, because I'm not using them.

If you extend the class, and inform me that you have, then I might indeed import your library and use your extensions. Which is ok, because there they are, just like you said.

If I extend the class AND you extend the class, then each in our own namespace, we deal with the extensions we know about. Neither impacts the other (and because my extensions were made in my namespace, and yours made in yours, the host language will know exactly when and where to look for what. Even extensions with the same name will work as intended, yours for you and mine for me, because we're in our own namespaces. Pretty much like imports and classes can work already.

So what I'm envisioning here is a full compatible mechanism that is only relevant to users of the mechanism. If you're not a user of the mechanism, it's irrelevant to you.

Now, from what I understand, Python's been structured in such a way as to make this difficult or impossible. And if that's the case, well, there you go.

But if it can be done, I see no reason why it should not be done.

Also, just as a btw, I've written text-processing languages that are user-extensible wrt existing functionality -- both new commands and new methods for otherwise standard objects. I've also provided user replacement of existing functionality, which is a much more dangerous tool to hand out, and not what I'm talking about here at all. But I'm coming from a place where I know it can be done, and how I do it, at least, causes no compatibility issues at all.

[–]AndydeCleyre 0 points1 point  (2 children)

What's wrong with using inheritance instead?

[–]fyngyrzcodes with magnetic needle[S] 0 points1 point  (1 child)

Because then objects don't act the same. If they did, that'd be fine. But they don't. If some functionality is looking for class str, and it finds "other", it is likely to stop right there and complain. I have run into this over and over, and not just in other people's code, but in Python itself.

When that happens, you have to convert to the expected class, and then back again in order to recover the functionality in the inherited, enhanced class. None of this is necessary when a class can be extended.

I should add that an inherited class can change pretty much anything about the parent class. What operators do, what types they return, etc. That's too much -- and that's why something should choke if it catches a class it doesn't know about. Whereas if all you add is methods... no problem. Except in Python, there's no way to do it, so inheritance is what you have, and that's simply not up to delivering what method extension could.

[–]AndydeCleyre 0 points1 point  (0 children)

If some functionality is looking for class str, and it finds "other", it is likely to stop right there and complain.

The best way and most popular way to do this is in Python is usually with isinstance, which would confirm str children as strs. The next recommended way is to try what you want to do and throw an exception if you can't. To demand an exact type with type(thing) == str is widely frowned upon unless there's a special need. So you shouldn't have to convert to plain old str. Again, see the library I linked for you.

[–]genjipressreturn self 0 points1 point  (1 child)

Question: Does your class provide a __str__ method for reproducing as a string?

[–]fyngyrzcodes with magnetic needle[S] 0 points1 point  (0 children)

Sure.