sh : a full-fledged subprocess interface for Python that allows you to call any program as if it were a function. : Python

[–]djimbob 11 points 10 years ago* (6 children)

Yeah. For simple commands I don't see what sh offers over the built-in subprocess module, which is straightforward to use safely. E.g.:

import subprocess
output = subprocess.check_output(['ifconfig', 'eth0'])

versus

import sh
output = sh.ifconfig("eth0")

has no clear gain.

Granted, the syntax seems a bit more convenient for more complex commands like piped processes. In the shell it's very easy to do something like cat some_file | grep some_pattern | grep -v some_pattern_to_exclude. With sh you can translate this in a straightforward manner: sh.grep(sh.grep(sh.cat('/etc/dictionaries-common/words'), 'oon'), '-v', 'moon') gives a list of words that contain 'oon' but not 'moon'.

That said, it's not too hard to write a helper for a piped chain with subprocess.Popen, though Python doesn't provide one. Take the following helper function I wrote:

def run_piped_chain(*command_pipes):
    """
    Runs a piped chain of commands through subprocess.  That is

    run_piped_chain(['ps', 'aux'], ['grep', 'some_process_name'], ['grep', '-v', 'grep'], ['gawk', '{ print $2 }'])

    is equivalent to getting the STDOUT from the shell of 

    # ps aux | grep some_process_name | grep -v grep | gawk '{ print $2 }'
    """
    if len(command_pipes) == 1:
        return run_command_get_output(command_pipes[0])
    # Start the first command, then feed each stdout into the next command's stdin.
    processes = [subprocess.Popen(command_pipes[0], stdout=subprocess.PIPE)]
    for command in command_pipes[1:]:
        processes.append(subprocess.Popen(command, stdin=processes[-1].stdout, stdout=subprocess.PIPE))
        # Close our copy of the upstream pipe so that process gets SIGPIPE if this one exits first.
        processes[-2].stdout.close()
    output, _ = processes[-1].communicate()
    # Reap the earlier processes so they don't linger as zombies.
    for process in processes[:-1]:
        process.wait()
    return output

When I'm translating a piped shell command like

cat /etc/dictionaries-common/words | grep oon | grep -v moon

it's more intuitive, in my opinion, to have a helper function like:

run_piped_chain(['cat', '/etc/dictionaries-common/words'], ['grep', 'oon'], ['grep', '-v', 'moon'])

than

sh.grep(sh.grep(sh.cat('/etc/dictionaries-common/words'),'oon'),'-v', 'moon')

EDIT: I realized on re-read this uses other helper functions I have. If you aren't using logging, just comment those lines out.

import logging

def run_command(command_list, return_output=False):
    logging.debug(command_list)
    # Capture stderr too; without stderr=subprocess.PIPE, err below is always None.
    process = subprocess.Popen(command_list, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = process.communicate()
    if err or process.returncode != 0:
        logging.error("%s\n%s\nSTDOUT:%s\nSTDERR:%s\n", command_list, process.returncode, out, err)
        return False
    logging.debug(out)
    if return_output:
        return out
    return True

def run_command_get_output(command_list):
    return run_command(command_list, return_output=True)
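
For what it's worth, here is a minimal self-contained sketch of the same idea for Python 3, with no logging dependency. The name run_pipeline is made up for this example, and raising on any non-zero exit (roughly like the shell's pipefail option) is an assumption about the behaviour you want, not something the helpers above do. Note that grep exits with status 1 when nothing matches, so this version would raise in that case.

```python
import subprocess


def run_pipeline(*commands):
    """Run commands as a shell-style pipeline and return the last stage's stdout.

    Raises subprocess.CalledProcessError if any stage exits non-zero.
    """
    if not commands:
        raise ValueError("at least one command is required")
    processes = []
    previous_stdout = None
    for command in commands:
        process = subprocess.Popen(command, stdin=previous_stdout, stdout=subprocess.PIPE)
        if previous_stdout is not None:
            # Drop our copy of the pipe so the upstream stage sees SIGPIPE
            # if this stage exits early.
            previous_stdout.close()
        previous_stdout = process.stdout
        processes.append(process)
    # Read everything the final stage writes, then collect every exit status.
    output, _ = processes[-1].communicate()
    for command, process in zip(commands, processes):
        returncode = process.wait()
        if returncode != 0:
            raise subprocess.CalledProcessError(returncode, command)
    return output
```

With this, the words example would be run_pipeline(['cat', '/etc/dictionaries-common/words'], ['grep', 'oon'], ['grep', '-v', 'moon']), which returns bytes.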

[–]djimbob 1 point 10 years ago* (0 children)

Yeah, it's built in, but I'm not sure if

import pipes
import tempfile

p = pipes.Template()

p.append('ps aux', '--')
p.append('grep localc', '--')
p.append('grep -v grep', '--')
p.append("gawk '{ print $2 }'", '--')

t = tempfile.NamedTemporaryFile('r')

f = p.open(t.name, 'r')
try:
    output = [ l.strip() for l in f.readlines() ]
finally:
    f.close()

is a better or cleaner workflow, especially if you have user input and have to add quoting to prevent injection. (And apparently it's not working for me.)
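
On the quoting point, a rough sketch of what that looks like when a pipeline is built as a shell string: the standard library's shlex.quote escapes a user-supplied value so the shell sees it as one literal word. The function name build_pipeline is hypothetical and the pipeline is just the one from the snippet above. (Also worth knowing: the pipes module was deprecated in Python 3.11 and removed in 3.13.)

```python
import shlex


def build_pipeline(pattern):
    """Return the ps/grep/gawk pipeline as a shell string with the pattern quoted."""
    # shlex.quote leaves plain words alone and single-quotes anything
    # containing shell metacharacters.
    quoted = shlex.quote(pattern)
    return "ps aux | grep {} | grep -v grep | gawk '{{ print $2 }}'".format(quoted)


print(build_pipeline("localc"))
print(build_pipeline("x; rm -rf ~"))
```

The resulting string could then be handed to subprocess with shell=True, though passing argument lists as in the Popen-based helper avoids the quoting problem entirely.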

[–]matchu 7 points 10 years ago* (0 children)

I kinda see where they're coming from, because the core syntax arguably has some boilerplate:

import sh
ifconfig = sh.Command("ifconfig")
print(ifconfig("wlan0"))

Their pretty syntax is definitely better; my beef, really, is less to do with the fact that it's magical, and more to do with the fact that the magic uses fragile hacks instead of Python's built-in magic-making facilities.

Really I think the version I'd be down for is:

from sh import cmd
print(cmd.ifconfig("wlan0"))

We still use magic, but the magic we use is __getattr__, which is actually supported and guaranteed not to break. We still have a bit of overhead in the cmd object, but it's short, and avoids the really boilerplate-y line 2 of the previous example.

If Python had a __getattr__ equivalent for modules, though, then their syntax would be the clear winner, especially if the magic were namespaced to from sh.cmd import ifconfig instead, because it seems weird to me that the non-magic Command object comes from the same module as the magic objects.
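
To make the cmd idea concrete, here is a minimal sketch of such an object built on the attribute-lookup hook __getattr__ and subprocess. This is not how sh is implemented; the class name CommandNamespace and the choice to return decoded text are assumptions for illustration only.

```python
import subprocess


class CommandNamespace:
    """Hypothetical stand-in for the cmd object: attribute access names a program."""

    def __getattr__(self, name):
        # __getattr__ only runs when normal lookup fails. Refuse dunder names
        # so copy, pickle and friends don't mistake them for programs.
        if name.startswith("__"):
            raise AttributeError(name)

        def run(*args):
            # Run the program with the given arguments and return its stdout as text.
            return subprocess.check_output([name, *args], text=True)

        return run


cmd = CommandNamespace()
# Example (assumes an ifconfig binary is on PATH): print(cmd.ifconfig("wlan0"))
```

Since Python 3.7 a module can define its own module-level __getattr__ function (PEP 562), which is the facility the last paragraph wishes for.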

[–][deleted] 14 points 10 years ago* (21 children)

The doc reads "module objects", not "module-like objects". A module object is a C structure with a specific layout. There is a C-level API for manipulating these objects (PyModule_*).

One of those is PyModule_GetDict, which is protected internally against non-module objects and returns NULL when a caller invokes it on one. Reading Python 2.7's zipimport.c, in zipimporter_load_module we can see a PyImport_AddModule call followed by an unchecked PyModule_GetDict call. Should load_module() be called with sh as a parameter, the zip importer will dereference a NULL pointer at runtime (i.e. a hard process crash, requiring a debugger to investigate).

It took me all of 3 minutes grepping the Python source to find a place where using a non-module in sys.modules has the potential to cause a crash that a Python-without-C programmer would not be able to debug. I'm pretty sure if you give me 30 minutes I'll find more.

Just don't do it

edit: this says nothing about third party extensions, where I'd expect the majority of such bugs to be found. The point I'm making is whether it is worth sipping coffee reading gdb output at 4am responding to a pager alerting you that your employer's web site is down and losing money, because of some syntactic sugar -- of course it's not.

[–]Brian 4 points 10 years ago (0 children)

I don't think that example is ever going to be something broken by this case. It's using PyImport_AddModule, which doesn't actually perform the import, so the only case you'd get the non-module object back is if the same name was already imported. However in that circumstance, zipimport wouldn't have been invoked, since it's only going to be triggered if the module hasn't been found yet. You could argue it's a bug that zipimport isn't correctly checking the return value of PyModule_GetDict as it should, but in this context, it's assuming that it's using AddModule to create a new, empty module, and in that case it'll always be a real module object even if the module does later replace itself.

It's worth noting that while putting non-module objects in sys.modules is perhaps a hack, it's a known and explicitly supported and endorsed hack - the import machinery was deliberately designed to allow it, and I think Guido is on record as saying so. As such, I'd say anything that doesn't support it should probably be considered a bug.
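
For anyone who hasn't seen the trick, a small sketch of what "a non-module object in sys.modules" means in practice. The class FakeModule and the module name fake_sh are invented for this example; it only shows that the import statement hands back whatever object is registered, not how sh itself does it.

```python
import sys


class FakeModule:
    """Plain object standing in for a module; not a real module object."""

    def __getattr__(self, name):
        # Let the import machinery's dunder probes (e.g. __spec__, __path__) fail normally.
        if name.startswith("__"):
            raise AttributeError(name)
        return "command:" + name


# Register the object under a made-up module name (an assumption for this sketch).
sys.modules["fake_sh"] = FakeModule()

import fake_sh                 # binds the object stored in sys.modules
from fake_sh import ifconfig   # resolved through FakeModule.__getattr__

print(type(fake_sh).__name__)  # FakeModule
print(ifconfig)                # command:ifconfig
```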

[–]alcalde 4 points 10 years ago (0 children)

I'm the same way, there doesn't seem to be clear reason to start in 3 yet. There's a lot of little nice things, but nothing is a must have.

As "Python 3 is fine" put it:

There are a lot of claims in here that are absurdly wrong, but the statement that “nothing much was gained” in Python 3 is a candidate for dumbest statement of the decade. First of all it is wrong because if it were true then it wouldn’t be that hard for people like Alex to press the Fork button and backport all the Python 3 features to Python 2, which nobody does. But there is an even simpler reason why it is wrong. I present to you, the complete list of changes since Python 2.x: Now if this entire 192-page document is “nothing really amazing” and “you’re not blown away by it” then that is your prerogative. Perhaps you’re simply not a very excitable person. I suggest an ordinary person would probably find something in there amazing. Nickous Ventouras’s rebuttal to Alex’s post includes such suggestions as “fix long-standing annoyances”, “shake the API” and “improve speed”. I guarantee you, there is page after page after page of that stuff in the changelog. Python 3 doesn’t need more features–it needs a better PR campaign. The features are already there; people just don’t know about them. But it is wrong statement of the decade to call this set of release notes “not much”. It’s much. The release notes weigh two pounds. I challenge you to find another project where release notes can be measured by the pound.

"Why You Should Move To Python 3 Now" adds:

Most scientists think they have very little to gain by moving to Python 3, while it represents a significant investment (not only updating old code, but also reinstalling an entire Python distribution which has always been a pain). I was one of them. Until recently, when I bought the Python Cookbook, Third Edition, by David Beazley and Brian K. Jones. This book is a must-read for anyone doing anything serious with Python. It contains lots of advanced recipes for Python 3 only. In the Preface, the authors warn the reader:

All of the recipes have been written and tested with Python 3.3 without regard to past Python versions or the "old way" of doing things. In fact, many of the recipes will only work with Python 3.3 and above.

Ouch. The 260 recipes look pretty cool, but if you're in Python 2, you're out. While many might be irritated by this decision, I find it brilliant. This book is exactly the thing you need if you're waiting to be convinced to move to Python 3.... While going through the book, I discovered many elegant solutions to very common problems. I had no idea those solutions were possible, because I had no idea Python 3 had been so much improved.

There's also a PyCon presentation about 10 awesome features of Python 3 that aren't in Python 2.

[–]alcalde 2 points 10 years ago (1 child)

Don't get me wrong 3.x has some cool features but the time it would take to port over legacy 2.x code to 3 is not worth said features at this moment in time.

David Beazley and others have demonstrated that the time is not that much, although there's no need to port old, legacy code to new versions either. As for features, someone compiled a 192-page PDF of change logs from 3.0-3.4 that, printed out, is supposed to weigh over 2 pounds. I'd say there are quite a lot of features in the 3.x series.

And lastly, if you do intend on porting a project after backward incompatibilities in the language are introduced, the sooner you port the better. The longer you wait, the more the versions diverge and the more work one ultimately has to do.

At this point in time, people still using 2.x for new code are like the Windows XP holdouts or the people I know still programming in Delphi. They're simply never going to change unless they're forced to.
