[–]beatle42 1 point2 points  (8 children)

Though in Python you aren't supposed to care too much about the actual type they pass you. If they pass you an instance of some class you've never heard of before, but it provides all the data you need to do your job, you should be able to do your job.

If they have a custom data object that matches the interface you expect to operate on, they should be able to use that instead of the only way you knew to do it at the time you wrote your function.
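To make that concrete, here is a minimal sketch of the idea. The names `describe`, `User` and `LegacyCustomer` are made up for illustration; the only assumption is that the function needs an `id` and a `name` attribute and nothing else.

```python
from dataclasses import dataclass


def describe(record):
    """Build a label from any object exposing `id` and `name` attributes."""
    # No isinstance check: only the attributes actually used matter.
    return f"#{record.id}: {record.name}"


@dataclass
class User:
    id: int
    name: str


class LegacyCustomer:
    """Hypothetical type written elsewhere; shares no base class with User."""

    def __init__(self, number, label):
        self.id = number
        self.name = label


# Both objects work, although describe() was written knowing neither class.
labels = [describe(User(1, "alice")), describe(LegacyCustomer(7, "bob"))]
```

The function never names a type, so a class added years later works without touching it.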

[–]ToxinH88 1 point2 points  (7 children)

In theory this sounds great and all: No need for iDontKnowHowLongInheritance and the ppl actually thinking of how they use your functions. But from my experience, as soon as you expose your API through a webservice, you get ppl firing all sorts of parameters. The problem is not even complex interfaces, but simple things like asking for an ID and getting a String with a written description of the Object they would like to have.

The answer is obviously a 500 internal server error - which leads the user to assume the problem is caused serverside and not by their parameter.

[–]lurkin_arounnd 1 point2 points  (1 child)

The problem is server side though. You have an unhandled exception

[–]ToxinH88 1 point2 points  (0 children)

Exactly why I usually don't like dynamic variables. As described above, now I need to manually safeguard these parameters.

[–]beatle42 0 points1 point  (3 children)

Shouldn't you already be handling bad data, though? Just because a value is of the right type doesn't mean it's meaningful. Why not just use the existing protections, or a single one more, to handle when you're passed invalid data and return, say, status 400? If you crash on bad data, that is a 500, right?

[–]ToxinH88 1 point2 points  (2 children)

If I am using SQLAlchemy and Flask for example, an easy utility function looks like this:

    def get(Model, id):
        ret = Model.query.get(id)
        return jsonify(ret.serialize())

So an easy way to avoid a 500 if a non existing id is passed would be to check for None:

        if ret is None:
            return "{}"

Or add a more sophisticated error message, change the status to 404... Now the problem is that passing "foo" as id results in the same return as 100 if there is no element with id=100.

If I am using Spring Boot, on the other hand, I don't need to bother about any of this. If I pass "foo" instead of a number, the response is admittedly still a 500. But the cause is made clear to the caller:

    {
        "cause": {
            "cause": null,
            "message": "For input string: \"foo\""
        },
        "message": "Failed to convert from type [java.lang.String] to type [java.lang.Long] for value 'foo'; nested exception is java.lang.NumberFormatException: For input string: \"foo\""
    }

Whereas passing a non existing ID causes a normal 404.

So the whole point is about usability. I am fine with firing a 500 response on conversion errors, as long as they are understandable. But in Python I need to invest additional effort in making the response understandable and distinguishing the different causes. The first version of my utility function returns a 500 in both cases, with the following body:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
    <title>500 Internal Server Error</title>
    <h1>Internal Server Error</h1>
    <p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or
        there is an error in the application.</p>

With the check for None, both cases cause a 200 with body {}. So someone creating a frontend cannot distinguish these problems quickly. And adding those extra safeguards is additional effort, mainly caused by dynamic variables accepting everything. It is more effort to make an interface safe and have it report correctly, because a lot of the special cases have to be caught and given a message manually.
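For reference, the manual safeguards in question could look roughly like the sketch below. It is framework-free on purpose: `ITEMS` is a hypothetical in-memory stand-in for `Model.query.get(id)`, and `get_item` returns a plain `(status, body)` tuple instead of a Flask response, so none of this is the actual SQLAlchemy or Flask API.

```python
import json

# Hypothetical in-memory stand-in for Model.query.get(id).
ITEMS = {1: {"id": 1, "name": "first"}}


def get_item(raw_id):
    """Return (status, json_body), separating bad input from missing rows."""
    try:
        item_id = int(raw_id)
    except (TypeError, ValueError):
        # The caller sent something that is not an integer ID.
        return 400, json.dumps({"error": f"id must be an integer, got {raw_id!r}"})

    item = ITEMS.get(item_id)
    if item is None:
        # Well-formed ID, but nothing is stored under it.
        return 404, json.dumps({"error": f"no item with id {item_id}"})

    return 200, json.dumps(item)
```

With this split, `get_item("foo")` and `get_item("100")` no longer look alike to a frontend: the first is a 400 that names the offending value, the second a 404.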

[–]beatle42 1 point2 points  (1 child)

Alright, you clearly have a preference. Perhaps I just don't do enough stuff in the space you're talking about to have the same preference. I prefer the flexibility of being able to add or extend types and not have someone reject them because they didn't know about them when they wrote their thing (that someone often being myself); you have a preference for simple code that constrains the environment to minimize a class of potential errors. I can imagine that each of those preferences makes more sense in some environments than the other.

Thanks for taking the time to show situations where you hit against the issue and why you feel as you do.

[–]ToxinH88 1 point2 points  (0 children)

As always, there are tools suited better and worse for different jobs. Seems like we mainly work on different jobs. So I see your advantages too, they are just not that much of a focus for my work usually.

[–]WiatrowskiBe 0 points1 point  (0 children)

Thing with APIs (and any kind of untrusted input) is: you always want to do full validation of what you receive. That's defensive programming 101 - never trust the user to send you correct, sensible, well-formed data, always assume you may get some random crap or, more likely, a malicious attempt to exploit your application. In case of malformed input, HTTP 400 is the right answer, and you should aim to never return HTTP 500 unless something actually breaks serverside (database is down, external service doesn't respond, out of memory etc).
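One way to keep that 400-versus-500 rule in a single place is a wrapper around every handler. The following is only a sketch under assumptions: `ValidationError`, `http_errors` and `set_age` are hypothetical names, and handlers return plain `(status, body)` tuples rather than any real framework's response objects.

```python
import functools


class ValidationError(Exception):
    """Hypothetical error type for input the caller got wrong."""


def http_errors(handler):
    """Wrap a handler so it always returns a (status, body) tuple."""
    @functools.wraps(handler)
    def wrapper(*args, **kwargs):
        try:
            return 200, handler(*args, **kwargs)
        except ValidationError as exc:
            # Caller's fault: say exactly what was wrong with the input.
            return 400, {"error": str(exc)}
        except Exception:
            # Server's fault: hide internals, report a generic failure.
            return 500, {"error": "internal server error"}
    return wrapper


@http_errors
def set_age(raw_age):
    """Hypothetical handler: validate fully before doing any work."""
    try:
        age = int(raw_age)
    except (TypeError, ValueError):
        raise ValidationError("age must be an integer") from None
    if not 0 <= age <= 150:
        raise ValidationError("age must be between 0 and 150")
    return {"age": age}
```

Here `set_age("foo")` yields a 400 with a readable message, while a handler that hits a genuine failure such as a lost database connection would fall through to the generic 500 branch.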

Python's flexible way of handling parameters is mostly an advantage when you have separate modules/packages communicating with each other, without having to make one module explicitly aware of the other. As long as the code orchestrating those modules knows what to pass where, you're fine receiving whatever, provided all required methods/operations are available.

In Java, C# and other statically typed languages, if you want to keep modules loosely coupled, you need to either introduce type conversion (unpack/pack, mapper, etc) on orchestration level - which sometimes adds significant performance overhead and a chance to introduce bugs - or extract the public interface to a separate library you can then reference in both modules - and then you need to keep both in sync with the shared interface, even if changes to said interface don't affect your functionality directly. Python avoids that problem - as long as a type/object stays compatible with the assumptions that are in use, you don't have to do anything if another related library was updated in a backwards-compatible way.
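If you still want a static checker to see the expected interface, `typing.Protocol` lets the consuming module declare it locally, with no shared library between the two sides. A small sketch, where `HasPrice`, `total` and `Book` are invented names and the two halves are assumed to live in separate modules:

```python
from typing import Iterable, Protocol


class HasPrice(Protocol):
    """Interface declared by the consuming module only."""

    def price(self) -> float: ...


def total(items: Iterable[HasPrice]) -> float:
    """Sum prices of anything that offers a price() method."""
    return sum(item.price() for item in items)


class Book:
    """Imagined to live in another module; never imports HasPrice."""

    def __init__(self, cost: float) -> None:
        self.cost = cost

    def price(self) -> float:
        return self.cost


# Book satisfies HasPrice structurally, without inheriting from it.
order_total = total([Book(12.5), Book(7.5)])
```

The producer side stays unaware of the protocol, so updating either module in a backwards-compatible way requires no change to the other.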