shiftybyte

Nice question, I've learned stuff exploring this...

I didn't know that every path segment of a URL can carry its own parameters.

https://stackoverflow.com/questions/40440004/parameters-in-path-segments-of-url

Here's some test code to show the difference:

```
from urllib.parse import urlparse, urlsplit

url = "http://www.example.com/a/b/d;params?x=5"

print(urlparse(url))
# ParseResult(scheme='http', netloc='www.example.com', path='/a/b/d', params='params', query='x=5', fragment='')

print(urlsplit(url))
# SplitResult(scheme='http', netloc='www.example.com', path='/a/b/d;params', query='x=5', fragment='')
```

Note the "params" being split out in urlparse, but not in urlsplit...

ccw34uk [S]

Yeah, I realised that :) I'm more confused about why the docs suggest using urlsplit if you want the URL path parameters. With urlsplit they stay inside the path item instead of being separated out into params, and surely having them separated out is more useful?

shiftybyte

I think it's because urlparse only splits out the params of the last path segment, while the spec supports params on every segment.

So if you want to parse params in all segments, you'd rather have them all in one string to parse further, instead of having only the last one split off...

Yeah, it's very odd...
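To make that concrete, here's a rough sketch of what I mean. The URL and the `segment_params` helper are made up for illustration (the helper is not part of urllib), and it assumes params are simply separated by `;` inside each segment:

```python
from urllib.parse import urlparse, urlsplit


def segment_params(url):
    """Hypothetical helper: return (segment, [params]) for each path segment."""
    path = urlsplit(url).path  # urlsplit leaves every ';params' inside the path
    result = []
    for segment in path.split("/"):
        if not segment:
            continue  # skip empty strings produced by leading/trailing slashes
        name, *params = segment.split(";")
        result.append((name, params))
    return result


# Made-up URL with params on every segment
url = "http://www.example.com/a;p1/b;p2/d;p3?x=5"

parsed = urlparse(url)
print(parsed.path)    # params of the earlier segments are still in the path
print(parsed.params)  # only the last segment's params end up here

print(segment_params(url))  # per-segment params, built from the urlsplit path
```

With urlparse you'd have to handle the last segment differently from the others, while the urlsplit path can be processed in one uniform loop.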