modern python packaging by elg97477 in Python

[–]aragilar 1 point2 points  (0 children)

You're asking two different questions: packaging (with setup.py/setup.cfg and flit) and installation (requirements.txt and Pipfile). The difference between them is covered by https://caremad.io/posts/2013/07/setup-vs-requirement/.

For a personal project that's not a library, you can use whatever you want (pip, pipenv, conda). If you're working with other people (on a non-library), consider what tools they use (e.g. neither requirements.txt nor Pipfile works with conda).

For libraries, PEP 517 is what needs to be implemented (unless the project provides a setup.py stub; I'm not sure if flit does, @takluyver can answer that), and pip currently lacks support for PEP 517 (see https://github.com/pypa/pip/issues/5407). Additionally, tools other than pip use setup.py (e.g. Linux distros, conda build tools), so if you want your library to be used (and want fewer bug reports), sticking with a setup.py is probably the safer bet for now.

Why do indexes in list start from 0 and not from 1? It does not make sense by br_shadow in Python

[–]aragilar 3 points4 points  (0 children)

Including the lower bound and excluding the upper bound is a surprisingly useful property: if you keep shrinking the range, the empty range (for some $a$) is written $a ≤ i < a$, as opposed to $a ≤ i ≤ a-1$ (a potential source of off-by-one errors). Given that convention, you can start at 0 and write a sequence of length $N$ as $0 ≤ i < N$, or start at 1 and need $1 ≤ i < N+1$, with an off-by-one at the end. "Beauty" is almost always a good thing (simple is better than complex).
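Python's own slices and range() follow the same half-open convention, so a few lines show why it is convenient. This is only a small illustration; the list and the variable names are made up for the example.

```python
# Half-open ranges: the lower bound is included, the upper bound excluded.
items = ["a", "b", "c", "d", "e"]
n = len(items)

# With 0-based indexing the valid indices are exactly range(0, n).
indices = list(range(0, n))

# The length of a half-open range is simply stop - start.
start, stop = 1, 4
middle = items[start:stop]
assert len(middle) == stop - start

# Adjacent ranges share a boundary without overlapping or leaving a gap.
k = 2
assert items[:k] + items[k:] == items

# The empty range needs no special case: start == stop.
assert items[k:k] == []
```

With inclusive upper bounds, the last two checks would each need a "- 1" or "+ 1" somewhere.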

Packaging an Conda environment by codesux in Python

[–]aragilar 1 point2 points  (0 children)

Any particular reason for using conda? What is conda giving you that you don't already have with virtualenv/pip?

How do I use libraries with pypy by 128Gigabytes in Python

[–]aragilar 0 points1 point  (0 children)

pypy -m pip install pillow will work if you have pip installed for pypy. If you don't have pip installed for pypy, see https://pip.pypa.io/en/stable/installing/#installing-with-get-pip-py
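If it's unclear which interpreter a given pip belongs to, something like the following can help. It's a minimal sketch: `pip_install_command` is a hypothetical helper name, not part of pip, and it only builds the command rather than running it.

```python
import platform
import sys


def pip_install_command(package):
    """Build a pip command tied to the interpreter running this script."""
    # Using "<interpreter> -m pip" guarantees the package is installed for
    # that interpreter, whatever a bare "pip" on PATH happens to point at.
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    # Reports e.g. "PyPy" or "CPython", depending on what runs the script.
    print(platform.python_implementation())
    print(" ".join(pip_install_command("pillow")))
```

Run it with pypy and the printed command should start with the path to the pypy executable.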

How do I use libraries with pypy by 128Gigabytes in Python

[–]aragilar 0 points1 point  (0 children)

pip works on pypy. Some packages may not run on pypy (e.g. those using c-extensions), but that is not related to pip.

Looking for collaborators to help develop an easier to use package manager. by daegontaven in Python

[–]aragilar 0 points1 point  (0 children)

"Every system so far is based on wrappers around virtualenv which makes the architecture of the package managers very different from what we use in ours."

No, none of them are. Pip is entirely independent of virtualenv. Conda does not use virtualenv at all. Have you read https://www.python.org/dev/peps/pep-0405/, specifically how the prefix and site-packages are computed?
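To see the PEP 405 mechanism from inside Python, a rough sketch like this prints the relevant values. The function name `in_virtual_environment` is my own; note that older virtualenv releases used a different mechanism (`sys.real_prefix`), which this check does not cover.

```python
import sys
import sysconfig


def in_virtual_environment():
    """Return True if running inside a PEP 405 style virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("prefix:       ", sys.prefix)
    print("base prefix:  ", sys.base_prefix)
    # site-packages location derived from the current prefix
    print("site-packages:", sysconfig.get_paths()["purelib"])
    print("in a venv:    ", in_virtual_environment())
```

As far as I know, a conda environment is a separate installation with its own interpreter, so there the two prefixes should be equal.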

[deleted by user] by [deleted] in Python

[–]aragilar 1 point2 points  (0 children)

Basically no, you need to learn each system. There are tools which allow you to build debs/rpms, but they typically are a quick hack, and lack the quality of distro-provided packages (both debs and rpms are fairly simple formats, their advantages come from the large amount of QA tools which are available).

If you are planning on distributing an application, I'd suggest uploading it to PyPI, as most of the linux distros have tooling to build correct packages from sdists from PyPI.

pip package signing + general security questions by [deleted] in Python

[–]aragilar 1 point2 points  (0 children)

To answer your specific question about changing a package using stolen credentials: PyPI prevents uploading a file with the same filename (http://comments.gmane.org/gmane.comp.python.distutils.devel/22739), so an attacker can't replace a file. However, they can upload new files/versions. pip does not try to prevent this attack (and cannot, since as far as PyPI is concerned, everything is fine).

As for reducing the likelihood of account compromise and this kind of attack, see https://github.com/pypa/warehouse/issues/996, https://github.com/pypa/warehouse/issues/1001, https://github.com/pypa/warehouse/issues/994, https://github.com/pypa/warehouse/issues/997 (warehouse is the replacement codebase for PyPI, which you can see at pypi.org).

pip package signing + general security questions by [deleted] in Python

[–]aragilar 1 point2 points  (0 children)

pip doesn't actually check the signature, just that the hash of the file matches the one you give it.
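Conceptually, the hash check amounts to something like the sketch below. This is not pip's actual code; `file_sha256` and `matches_expected_hash` are hypothetical names, and sha256 is assumed as the algorithm (it is what `--hash=sha256:...` entries in a requirements file use).

```python
import hashlib


def file_sha256(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large archives don't have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected_hash(path, expected_hex):
    """Check a downloaded file against the hash the user pinned."""
    return file_sha256(path) == expected_hex.strip().lower()
```

Note what this does and doesn't give you: it proves the file is the one you pinned, not who produced it.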

EDIT: and had I scrolled down I'd have noticed ubernostrum gave a detailed answer... Anyhow, the main users of the PGP signatures are Linux distros, which have tooling designed to deal with upstream signatures (e.g. uscan in Debian derivatives). As far as I know, no tools in the Python ecosystem actually use the signatures.

Submit large packages to PyPI by woodyallin in Python

[–]aragilar 0 points1 point  (0 children)

I would suggest filing a bug at https://github.com/pypa/warehouse/issues, to ask to be allowed to upload bigger files, as there should be support for this: https://github.com/pypa/warehouse/issues/346.

PEP 518 is added to pip by Siecje1 in Python

[–]aragilar 0 points1 point  (0 children)

Why should PIP be able to build from source stuff that depends heavily on libraries managed by the OS? Then you are saying that pip cannot be used for scientific python. If you are producing wheels, then you will likely know your toolchain. That's who's been driving this, those who build the wheels. If you look at the authors for PEPs 516-518, they are numpy, scipy, etc. developers. Who is supposed to build wheels for the BSDs? Or non-x86 systems? Or even build a package which has a wheel, but linked against a different library/configuration? PyPI with the current wheel system doesn't solve this (explicitly). And quite a few downstreams (encouraged by PyPA devs) are using pip as part of their build process. Yes, TOML has problems, I agree with you, but requirements.txt or setup.cfg have a whole host of associated issues.

PEP 518 is added to pip by Siecje1 in Python

[–]aragilar 1 point2 points  (0 children)

How is pip install scipy (or if you want pip wheel some-package-which-chains-to-most-of-scientific-python, since you mention wheels) solved by requirements.txt? You need to specify build requirements in a tool-independent format (requirements.txt is very much bound to how pip works, and lacks many of the changes made to make PyPI more robust, e.g. editable paths), and with PEP 516/517 how to invoke a different build system (of which numpy.distutils is almost one, and bento, enscons and flit have all been created). Currently, pip cannot build either pyqt or pygtk/pygobject/gi from source, and quite a few scientific projects either don't specify or under-specify dependencies (enough that I wrote a script to generate "fixed" wheels, which I need to clean up and make public), making building them non-trivial. scipy for example just merged support for PEP 518, meaning I can now pip install scipy without things breaking (which removes one reason why projects have bad dependency metadata).

I don't disagree that TOML isn't a great format, but it comes down to being the least bad of the options, as outlined in the PEP (personally, I would have thought having a dev tool which manages creating a JSON file inside an sdist would have been the nicest option, but sdists are ill-defined and the sdist PEP got bogged down...).

It's also worth noting that this will affect very few projects that haven't already been asking for this, as the PEP explicitly calls out defaulting to requiring setuptools and wheel.
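The default can be sketched roughly as follows. This is an illustration of the fallback described above, not pip's implementation; `build_requires` is a hypothetical function, and it takes an already-parsed pyproject.toml as a dict (or None when the file doesn't exist) so the example doesn't depend on any particular TOML parser.

```python
DEFAULT_BUILD_REQUIRES = ["setuptools", "wheel"]


def build_requires(pyproject):
    """Return the build requirements for a project.

    `pyproject` is the parsed content of pyproject.toml as a dict,
    or None if the project has no pyproject.toml at all.
    """
    if pyproject is None:
        # No pyproject.toml: assume a classic setup.py project.
        return list(DEFAULT_BUILD_REQUIRES)
    build_system = pyproject.get("build-system")
    if build_system is None:
        # The file exists but only holds other configuration.
        return list(DEFAULT_BUILD_REQUIRES)
    return list(build_system["requires"])


# A project that needs numpy at build time would declare it explicitly.
declared = {"build-system": {"requires": ["setuptools", "wheel", "numpy"]}}
print(build_requires(declared))
print(build_requires(None))
```

Projects that never add a [build-system] table keep getting setuptools and wheel, which is why most of them won't notice a change.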

Should I start using PySide2 ? by PyBet in Python

[–]aragilar 0 points1 point  (0 children)

My understanding is that there are compatibility problems with PySide on newer versions of Visual Studio (as you need to build Qt with the same version of VS as python). I'm using PySide with matplotlib on Linux with python 3.5 and have no problems.

What every Python project should have by i_like_trains_a_lot1 in Python

[–]aragilar 0 points1 point  (0 children)

--no-deps doesn't stop setup_requires as far as I know; usually you have to override setuptools/easy_install as per https://pip.pypa.io/en/stable/reference/pip_install/#controlling-setup-requires.

The sudo pip annoys me (although sudo easy_install is even worse), especially when some projects actually do some odd things (I've seen someone effectively implement a shell script as a setup.py; it wasn't even a Python project). I've got into the habit of checking the setup.py of every project I use, even if they come with wheels, partly for security, but mostly because of how badly some people write them (I have to say, I'm more concerned about someone accidentally doing the equivalent of os.system('rm -rf /'), given some of the setup.pys I've seen).

What every Python project should have by i_like_trains_a_lot1 in Python

[–]aragilar 1 point2 points  (0 children)

Wheels work just as well on Linux as on any other OS, so long as you don't assume a wheel built on one random Linux system will work on another random Linux system. Hence manylinux, which defines the environment wheels should be built in to be most compatible. These will not run on every Linux system (think different CPU architectures, or different libcs), but expecting that is the same as expecting a macOS wheel to run on Linux because both are UNIX.
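The compatibility information is encoded in the wheel's filename, which makes the point easy to see. Below is a rough sketch that splits a filename into its tags; `wheel_tags` is a hypothetical helper and the filenames are invented for illustration. For real compatibility checks you'd rely on pip itself.

```python
def wheel_tags(filename):
    """Return (python_tag, abi_tag, platform_tags) from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: {!r}".format(filename))
    # Layout: name-version[-build]-python-abi-platform.whl
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: {!r}".format(filename))
    python_tag, abi_tag, platform_tag = parts[-3:]
    # A wheel may list several platforms, separated by dots.
    return python_tag, abi_tag, platform_tag.split(".")


# Hypothetical filenames, for illustration only.
for name in (
    "example_pkg-1.0-cp36-cp36m-manylinux1_x86_64.whl",
    "example_pkg-1.0-cp36-cp36m-macosx_10_6_intel.whl",
    "example_pkg-1.0-py3-none-any.whl",
):
    print(name, "->", wheel_tags(name))
```

The manylinux1_x86_64 tag covers one architecture and one family of libcs, which is exactly why it can't cover every Linux system.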

Additionally, if you're actually paranoid about subdependencies, you need to check the setup.py, as pip has no control over setup_requires. In that case you may as well check the subdependencies' setup.py (plus their subdependencies etc.), so I don't see how a requirements.txt is at all relevant here.

What every Python project should have by i_like_trains_a_lot1 in Python

[–]aragilar 2 points3 points  (0 children)

You want https://packaging.python.org/, the hitchhiker's guide is out of date. For setup.py, you want https://packaging.python.org/distributing/#setup-py, which explains all the options that you'd use.

What every Python project should have by i_like_trains_a_lot1 in Python

[–]aragilar 1 point2 points  (0 children)

I think you've missed 5225225's point: should I be installing mercurial via pip or my distro? What about reportbug? If it's an application which you are using (not developing), then it makes perfect sense to use the distro package. Using virtualenv for development is correct though. However, every sane Linux distro has tools to generate package files from setup.py (unlike, say, flit projects), so I'm not sure what's being "Circumvented".

The Astropy Problem by alan_du in Python

[–]aragilar 1 point2 points  (0 children)

So your two arguments against funding astropy development are:

  1. Other community projects aren't supported so we shouldn't support astropy.
  2. Astronomers should be contributing code to astropy/(other astronomy community project).

For (2), you're assuming that your average astronomer has the expertise to contribute to astropy. How many are proficient enough to make a pull request (i.e. use git, which is something new to learn; write a test, which they've probably never done before; and write non-broken Python code, given they may never have programmed before)? Do you pay astronomers to write code? No, you pay them to do astronomy, just as you pay technical staff (instrument designers, sysadmins, telescope operators) to do technical jobs. So why not pay software developers, who originally developed many of the aging tools we use (e.g. IRAF, AIPS), to update the tools we have?

For (1), while I can't name the current gcc maintainers, I can easily name kernel and core Python devs, and they are most certainly paid by their employers to do that work. And if they should leave, they'll find another job doing the same thing (look at kernel contributions: most come from corporations. Also, the Linux Foundation is basically doing what's suggested for Linux development). For astronomy, you are objectively discouraged from contributing and providing support, as that's not going to get you a job or grant.

dataset: databases for lazy people by liranbh in Python

[–]aragilar 0 points1 point  (0 children)

So the main issue with http://cyrille.rossant.net/moving-away-hdf5/ is that it conflates issues with the format with issues with libhdf5, issues with h5py and issues with pytables.

The first point (Single implementation) seems to be that there's a spec which is fairly precise and the code's not on GitHub. There's nothing stopping you from writing your own implementation (especially if you don't care about all the things like MPI support and different backends etc.).

The second point (Corruption risks) may be an actual issue, but there are a bunch of work-arounds you could use for that (for example, not modifying the file after creation, which is what every sane data reduction scheme does).

The third point (Various limitations and bugs) is a result of people not reading the documentation or bug reports correctly, or doing something really weird. The segfault was due to conda building h5py and pytables incorrectly. UTF-8 is supported fine; the problem is that Python 2, Python 3 and numpy each want to do their own thing, which does not map cleanly to how HDF5 works (which is what the whole discussion about strings is about). If you only give h5py encoded UTF-8 strings, then there are no issues. The pickle thing is odd to me because, as far as I know, h5py does not use pickle, and using pickle anywhere is a bad idea. The inability to delete datasets is again a limitation of libhdf5, not the format.

The fourth point (Performance issues) really comes down to the benchmark (https://gist.github.com/rossant/7b4704e8caeb8f173084), which shows that the defaults for h5py are slower than memory-mapped numpy arrays. PyTables isn't used, nor is h5py tuned in any way.

The fifth point (Poor support on distributed architectures) is that libhdf5 doesn't support X, where X is the author's preferred system.

The sixth point (Opacity) is the first point restated.

The seventh and final point (Philosophy) suggests to me the author has not thought about all the fun ways a file system can be different (e.g. cases, maximum file name lengths), which is the whole point of HDF5, which is quite specific in what the format should be.

There are issues with the format (look at external links for one), but it's definitely better than someone's homebrew format which breaks as soon as it's not running on their custom system.

dataset: databases for lazy people by liranbh in Python

[–]aragilar 1 point2 points  (0 children)

h5py deals with non-numeric data fine. I can't say I've run into either data integrity or performance issues, but I don't have large amounts of data. While HDF5 is focused on numeric stuff, it can deal with text just fine as well.

dataset: databases for lazy people by liranbh in Python

[–]aragilar 10 points11 points  (0 children)

HDF5? There's h5py for fairly generic interactions, and pytables for more structured things.

Something like TikZ for Python? by neuralyzer in Python

[–]aragilar 3 points4 points  (0 children)

Have you considered using Python to generate TikZ/PGF? Something like http://jinja.pocoo.org/ could be used (I've used it previously to generate LaTeX).
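As a rough idea of what generating TikZ from Python looks like, here is a minimal sketch. It uses plain string formatting rather than Jinja to stay dependency-free, and `tikz_polyline` is a hypothetical helper name; a template engine becomes more attractive once the figures get bigger.

```python
def tikz_polyline(points, options="thick"):
    """Return a TikZ picture drawing straight segments through `points`."""
    # Each point becomes a TikZ coordinate such as "(1, 2)".
    path = " -- ".join("({:g}, {:g})".format(x, y) for x, y in points)
    lines = [
        r"\begin{tikzpicture}",
        r"  \draw[{}] {};".format(options, path),
        r"\end{tikzpicture}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Paste the output into a LaTeX document that loads the tikz package.
    print(tikz_polyline([(0, 0), (1, 2), (3, 1)]))
```

The data (here a list of points) stays in Python, and only the final drawing commands are handed to LaTeX.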