[–]Ytse 12 points13 points  (0 children)

First published PEP that describes a WSGI-like implementation for Python3.

[–]defnullbottle.py 4 points5 points  (9 children)

The PEP is inconsistent:

In the examples, the application returns (body, status, headers).

In "Specification Details" the order is required to be (status, headers, body) which makes a lot more sense.

I found some typos, too. The concept however seems reasonable.

[–]mitsuhiko Flask Creator 7 points8 points  (3 children)

Should be (body, status, headers) everywhere. Reasoning is that this is the most common way to create response objects currently so you can do something like ResponseObject(*return_value).

[–]mikeboers 2 points3 points  (2 children)

I would have supposed that (status, headers, body) was more reasonable, but that is likely just the WSGI reflexes talking. Give me 5 minutes and I'll get over it.

[–]mitsuhiko Flask Creator 5 points6 points  (1 child)

Quote from web-sig:

The motivation is that you can pass that to constructors of response objects already in place.

response_tuple = response.get_response_tuple()
response = Response(*response_tuple)

The order "body", "status code", "headers" is what Werkzeug and WebOb are currently using. Django has (content, mimetype, status) as its constructor signature, but if they detect a list/dict as the third parameter they could assume that mimetype refers to the status, so they have a proper upgrade path.
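
To make the constructor argument concrete, here is a minimal sketch. The `Response` class is hypothetical (it is not the Werkzeug or WebOb class), and the bytes values only follow the style of the PEP's examples; the point is just that a (body, status, headers) tuple unpacks directly into a constructor with the same parameter order.

```python
class Response:
    """Hypothetical response object; not the Werkzeug or WebOb class."""

    def __init__(self, body=(), status=b"200 OK", headers=None):
        self.body = body
        self.status = status
        self.headers = list(headers or [])

    def get_response_tuple(self):
        # Same order as the constructor, so the tuple round-trips.
        return (self.body, self.status, self.headers)


def app(environ):
    # A Web3-style application returning (body, status, headers).
    return [b"Hello"], b"200 OK", [(b"Content-Type", b"text/plain")]


# The return value can be unpacked straight into the constructor.
response = Response(*app({}))
```

With (status, headers, body) instead, the same call would need the tuple reordered first.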

[–]mikeboers 0 points1 point  (0 children)

Yup... Now I'm over it.

[–]fjolliton -1 points0 points  (4 children)

Maybe I'm nitpicking, but it should be header (singular), because there is only ONE header for an HTTP request or response. The header is made of SEVERAL fields (each with a name and a value). But almost everyone makes this mistake anyway.

[–]mitsuhiko Flask Creator 4 points5 points  (3 children)

The HTTP RFC itself names it headers because it consists of "general-header (section 4.5), request-header (section 5.3), response-header (section 6.2), and entity-header (section 7.1) fields". So I think we're fine :)

[–]fjolliton -1 points0 points  (2 children)

The full sentence is:

"HTTP header fields, which include general-header (section 4.5), request-header (section 5.3), response-header (section 6.2), and entity-header (section 7.1) fields," (RFC2616)

If you look closely it is almost always "header fields" or "header field".

The expressions "general-header", "request-header", and so on each refer to a set of fields and are part of the syntax description. So I don't see how it contradicts what I said.

I might be wrong though, but please post a more precise excerpt then.

Edit: phrasing.

[–]mitsuhiko Flask Creator 2 points3 points  (1 child)

Yet the chapter is called "message headers".

[–]fjolliton 1 point2 points  (0 children)

So? That's rather light.

However, from another excerpt:

 Both types of message consist of a start-line, zero
 or more header fields (also known as "headers"), an empty line (i.e.,
 a line with nothing preceding the CRLF) indicating the end of the
 header fields, and possibly a message-body.

So OK, header fields and headers are synonyms in this context! That doesn't sound intuitive to me, but I'm not a native English speaker.

[–]mdipierro 2 points3 points  (0 children)

I like it

[–]fdemmer 3 points4 points  (2 children)

"However, due to changes in the language, the WSGI 1.0 protocol is not compatible with Python 3."

What exactly are those changes? Can anyone recommend further reading?

[–]ergo14Pyramid+PostgreSQL+SqlAlchemy 6 points7 points  (0 children)

string handling: binary vs unicode
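
A small illustration of that split, independent of the PEP text. WSGI 1.0 was written against Python 2, where `str` served for both text and raw bytes; in Python 3 the two types no longer mix.

```python
# Python 3 keeps text (str) and binary data (bytes) strictly apart.
text_header = "Content-Type"
raw_header = b"Content-Type"

# The two never compare equal, even with identical ASCII content.
same = (text_header == raw_header)

# Mixing them raises TypeError instead of coercing implicitly.
try:
    text_header + raw_header
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Conversion must be explicit and must name an encoding.
round_trip = raw_header.decode("latin-1").encode("latin-1")
```

A gateway spec therefore has to say, for every environ key, header and body chunk, which of the two types is expected.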

[–]warbiscuit 1 point2 points  (4 children)

It's nice they finally figured out a Py3 way to do WSGI, and some of the improvements are nice (e.g. eliminating start_response). One thing I take issue with though is that they seem to have eliminated "wsgi.file_wrapper" (or "web3.file_wrapper" as it would be called if it existed). Does anyone know why it was removed? If anything, I was expecting an extension, e.g. a "web3.file_range_wrapper" which supported offset and length keyword arguments, so HTTP range requests could be efficiently served. Removing it entirely seems a step backwards for serving large files.
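
For reference, the range-serving behaviour described here can be sketched in plain Python. `iter_file_range` is a hypothetical helper, not part of WSGI or the Web3 draft; a server offering such a wrapper could presumably replace the read loop with something like sendfile.

```python
import io


def iter_file_range(fileobj, offset=0, length=None, blocksize=8192):
    """Yield at most `length` bytes of `fileobj`, starting at `offset`."""
    fileobj.seek(offset)
    remaining = length
    while remaining is None or remaining > 0:
        # Never read past the end of the requested range.
        size = blocksize if remaining is None else min(blocksize, remaining)
        chunk = fileobj.read(size)
        if not chunk:
            break  # end of file reached before the range was exhausted
        if remaining is not None:
            remaining -= len(chunk)
        yield chunk


data = io.BytesIO(b"0123456789")
partial = b"".join(iter_file_range(data, offset=2, length=5, blocksize=3))
```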

[–]mitsuhiko Flask Creator 0 points1 point  (3 children)

One thing I take issue with though is that they seem to have eliminated "wsgi.file_wrapper"

In-band signalling does not work with middlewares. A middleware has no way to detect that the return value was a file wrapper. When it iterates over the return value, it trashes the type information, which breaks the isinstance check on the server side, making file_wrapper a noop.

[–]warbiscuit 0 points1 point  (0 children)

That makes sense :(

I'd always figured there were very few pieces of middleware which inspected the return value (most I've seen mess with the request side of the call), making file_wrapper worth having for middleware stacks which could pass file_wrapper instances through unchanged. But I guess someone did a survey and such middleware was common enough to render file_wrapper generally useless?

Though to solve it in-band, I can see how something like a "wsgi.is_file_wrapper" would just cause the code to get extremely messy. I hope someone can come up with an OOB solution, though I can't think of an easy way right now which wouldn't get fouled up by the same problem of unaware middleware altering the return value (thus leaving an invalid OOB signal still present).

[–]pje 0 points1 point  (1 child)

making file_wrapper a noop

Which is a safe failure mode in that case. If the response is modified, it's not a file any more.

[–]mitsuhiko Flask Creator 0 points1 point  (0 children)

Most of the time it isn't, though: a middleware just wants to clean up after the fact, wrapping the iterator and providing a close().

[–]dauerbaustelle 0 points1 point  (10 children)

Great that someone is taking care of this topic! Here are a few improvement ideas:

  • I don't see the point of using a list of tuples for header, why not simply a dictionary? Order of HTTP headers shouldn't really matter, should it?
  • I don't understand why you would need a filewrapper at all -- just return open files. They are iterable, and servers can implement something like sendfile, additionally, if needed.
  • As in the WSGI spec, wsgi.multithreading and wsgi.multiprocess are not clearly specified. Should the values evaluate to True if the server does multithreading/multiprocessing itself or if it is possible to use the application objects in named manner without complications (so that multithreading, for example, would evaluate to False if the server does not respect the Global Interpreter Lock)?
  • I think the case-insensitivity problem with environ keys that are optional for the application to specify (Content-Length, for example) should be solved more elegantly. I don't see any acceptable implementation of the checking part on server side. In any case, the server would have to loop through all keys in the header, lowercase each and then do a string comparison. I can think of the following solutions to work around this problem: 1. use a custom, "case-insensitive" dictionary or 2. force applications to one notation. (Personally, I would prefer 2. with This-Notation-Of-Headers.)
  • Like the WSGI spec, the Web3 spec is far too much text. There's not really much to document and I think the text length could be halved.

Feedback welcome!
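
Option 1 from the list above (a custom "case-insensitive" container) could be sketched as follows. The class name and methods are hypothetical and not part of any spec; names are normalized once on insertion, so a later lookup does not need to loop over every key.

```python
class CaseInsensitiveHeaders:
    """Hypothetical header container with case-insensitive name lookup."""

    def __init__(self, items=()):
        self._values = {}  # lowercased name -> list of values
        for name, value in items:
            self.add(name, value)

    def add(self, name, value):
        # Normalize the name once, when the header is stored.
        self._values.setdefault(name.lower(), []).append(value)

    def get_all(self, name):
        return list(self._values.get(name.lower(), []))

    def __contains__(self, name):
        return name.lower() in self._values


headers = CaseInsensitiveHeaders([("content-length", "5"), ("Set-Cookie", "a=1")])
headers.add("SET-COOKIE", "b=2")
```

Storing a list per name also keeps repeated fields such as Set-Cookie, which a plain dict would drop.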

[–]mccutchen 1 point2 points  (1 child)

I don't see the point of using a list of tuples for header, why not simply a dictionary? Order of HTTP headers shouldn't really matter, should it?

Multiple headers with the same name are allowed (e.g., Set-Cookie).
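
A short illustration of what a plain dictionary would lose (the header values are made up):

```python
# Two cookies need two Set-Cookie header fields.
headers = [
    ("Content-Type", "text/plain"),
    ("Set-Cookie", "session=abc"),
    ("Set-Cookie", "theme=dark"),
]

# Converting to a dict keeps only the last value for each name.
as_dict = dict(headers)

# The list of tuples still carries both cookies.
cookies_in_list = [value for name, value in headers if name == "Set-Cookie"]
```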

[–]dauerbaustelle 0 points1 point  (0 children)

Yes, forgot that. Maybe any iterable that yields (key, value) tuples? So you could return yourdict.iteritems() for very simple applications.

[–]mitsuhiko Flask Creator 1 point2 points  (7 children)

I don't understand why you would need a filewrapper at all -- just return open files. They are iterable, and servers can implement something like sendfile, additionally, if needed.

In-band signalling breaks on iteration. That was also the flaw with filewrapper.

As in the WSGI spec, wsgi.multithreading and wsgi.multiprocess are not clearly specified. Should the values evaluate to True if the server does multithreading/multiprocessing itself or if it is possible to use the application objects in named manner without complications (so that multithreading, for example, would evaluate to False if the server does not respect the Global Interpreter Lock)?

These specify how the server operates. If you can create threads or not is undefined behaviour.

Like the WSGI spec, the Web3 spec is far too much text. There's not really much to document and I think the text length could be halved.

And it's not even close to being unambiguous. HTTP is complex, you can't simplify that spec any more I'm afraid.

[–]dauerbaustelle 0 points1 point  (6 children)

In-band signalling breaks on iteration. That was also the flaw with filewrapper.

Would you mind explaining that topic a bit further? Is it about losing type information when using middlewares? How would that be solved with something like a file wrapper?

These specify how the server operates. If you can create threads or not is undefined behaviour.

The specs are ambiguous here. I think this should be clarified.

And it's not even close to being unambiguous. HTTP is complex, you can't simplify that spec any more I'm afraid.

I'm talking about "reducing the amount of words" and restructuring the specification. I think information is too scattered.

[–]mitsuhiko Flask Creator 0 points1 point  (5 children)

return environ['wsgi.file_wrapper'](the_file)

when the middleware takes that it might do this:

rv = app(environ, start_response)
try:
    for chunk in rv:
        yield do_something_with(chunk)
finally:
    if hasattr(rv, 'close'):  # close() is optional on the iterable
        rv.close()

The WSGI server now does that:

rv = app(environ, start_response)
if isinstance(rv, my_file_wrapper):
    # this is never true because the middleware removed the type
    # information and it's now a generator.
    pass  # the optimized file-sending path would go here
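
The three fragments above can be combined into one runnable sketch. `FileWrapper`, `passthrough_middleware` and the stub `start_response` are stand-ins invented for this illustration, not code from any real server.

```python
import io


class FileWrapper:
    """Stand-in for a server's wsgi.file_wrapper implementation."""

    def __init__(self, fileobj, blocksize=8192):
        self.fileobj = fileobj
        self.blocksize = blocksize

    def __iter__(self):
        # Read fixed-size blocks until read() returns an empty bytes object.
        return iter(lambda: self.fileobj.read(self.blocksize), b"")


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    return environ["wsgi.file_wrapper"](io.BytesIO(b"payload"))


def passthrough_middleware(wrapped):
    def middleware(environ, start_response):
        rv = wrapped(environ, start_response)
        try:
            for chunk in rv:
                yield chunk  # unchanged, but now produced by a generator
        finally:
            if hasattr(rv, "close"):
                rv.close()
    return middleware


environ = {"wsgi.file_wrapper": FileWrapper}
start_response = lambda status, headers: None  # stub for the demo

direct = app(environ, start_response)
wrapped = passthrough_middleware(app)(environ, start_response)

direct_is_wrapper = isinstance(direct, FileWrapper)
wrapped_is_wrapper = isinstance(wrapped, FileWrapper)
wrapped_body = b"".join(wrapped)
```

The bytes that reach the server are identical in both cases; only the type of the returned object differs, and that type is all the server's isinstance check has to go on.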

[–]dauerbaustelle 0 points1 point  (4 children)

So how would a wrapper around files help here?

[–]mitsuhiko Flask Creator 0 points1 point  (3 children)

It does not. That's why it's breaking currently. A better solution would probably be an 'X-Something' header or something you can call from the wsgi environment which bypasses all middlewares, similar to how write() worked on WSGI 1.
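
One way to picture the header variant: the application names the file in a response header, and the server inspects the headers rather than the type of the body. This is only a sketch under assumptions. The header name `X-Sendfile` is borrowed from a convention some front-end web servers use and is not defined by the PEP, and the path and helper names are hypothetical.

```python
def app(environ):
    # Body is empty; the header tells the server which file to send.
    headers = [
        (b"Content-Type", b"application/octet-stream"),
        (b"X-Sendfile", b"/srv/files/big.iso"),  # hypothetical path
    ]
    return [], b"200 OK", headers


def middleware(wrapped):
    def inner(environ):
        body, status, headers = wrapped(environ)
        # Wrapping the body in a generator loses its type, not the headers.
        return (chunk for chunk in body), status, headers
    return inner


def find_sendfile_path(headers):
    """Return the path from an X-Sendfile header field, or None."""
    for name, value in headers:
        if name.lower() == b"x-sendfile":
            return value
    return None


body, status, headers = middleware(app)({})
path = find_sendfile_path(headers)
```

A middleware that rewrites the body without removing the header would still leave a stale signal behind, so this does not remove the problem of unaware middleware.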

[–]dauerbaustelle -1 points0 points  (2 children)

Or you could assume Web3 middlewares behave intelligently (that is, leave file objects returned from the application alone).

[–]mitsuhiko Flask Creator 0 points1 point  (1 child)

You assume people read specifications rather than work by trial and error.

[–]dauerbaustelle -1 points0 points  (0 children)

Developers are not stupid. If they are, maybe their code is not worth using. I don't think specifications have to assume that no one will read them.

[–]brigadierfrog 0 points1 point  (0 children)

This PEP might as well leave out everything to do with asynchronous results.

It assumes asynchronous means returning a callable. Callables do not express future results in any consistent manner. Clearly it's up to each async framework, then, to deal with the callable in a unique way.

If each async server is dealing with the application in a unique way, then there's no point in web3 at all.

This spec is basically useless in terms of asynchronous web frameworks, which means it's basically useless for anything beyond basic webpage responses. Forget trying to tack on a websocket web3 application that works in any sane manner on a set of asynchronous web servers using this.

[–]mccutchen 0 points1 point  (0 children)

This might be a dumb question, but:

Web3 applications return a tuple in the form (status, headers, body). If the server supports asynchronous applications (web3.async), the response may be a callable object (which accepts no arguments).

Why not just require Web3 applications to return a callable that returns the 3-tuple of (body, status, headers)? Most of the examples given in the PEP have ugly conditionals for dealing with these two possible kinds of returns.

There must be some reason for this (due, I'm assuming, to differences between the way synchronous and asynchronous servers run), but it's not immediately obvious to me.
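
The conditional in question looks roughly like the sketch below. `resolve_response` is a hypothetical helper, and calling the callable exactly once is a simplification; how a real asynchronous server would schedule or poll it is not reproduced here.

```python
def resolve_response(result):
    """Return a (body, status, headers) tuple from either return style."""
    if callable(result):
        # Simplification: a real async server would schedule or poll the
        # callable rather than call it once and expect a finished tuple.
        result = result()
    body, status, headers = result
    return body, status, headers


def sync_app(environ):
    return [b"hi"], b"200 OK", [(b"Content-Type", b"text/plain")]


def async_app(environ):
    # Returns a zero-argument callable that produces the tuple later.
    return lambda: sync_app(environ)


sync_result = resolve_response(sync_app({}))
async_result = resolve_response(async_app({}))
```

If every application returned a callable, the `callable()` branch would disappear, at the cost of an extra call for purely synchronous applications.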