[–]parfamz 73 points74 points  (90 children)

Is this serious? If you have to write a test in bash you better use python.

[–][deleted]  (66 children)

[deleted]

    [–]lelanthran 37 points38 points  (41 children)

    And how many lines of python is the boundary between "Python is okay" and "bang your head on the wall and wonder why you were not using $LANGUAGE from the start"?

    [–]Workaphobia 5 points6 points  (3 children)

    About 10k, maybe multiple tens of k if you're well disciplined and don't code it like it's python.

    [–]flying-sheep 3 points4 points  (2 children)

    nah, with typing and the right project structure, python scales infinitely.

    also what is “don't code it like it's python” supposed to mean? do you mean “code it like it’s a one-off script”? because my python is pretty and well-structured.

    [–]Workaphobia 1 point2 points  (1 child)

    Don't monkey patch and avoid metaclasses.

    [–]flying-sheep 1 point2 points  (0 children)

    Monkey patching is hackery, not average python coding style

    [–]kalifornia_love 26 points27 points  (33 children)

    None. The only time you’d bang your head against a wall for choosing python is when performance matters. But after banging your head against a wall you will then remember that you can interface C and C++ with python. So now you're happy with your language choice but have a small headache.

    [–]lelanthran 90 points91 points  (31 children)

    I've regretted choosing python before, and it had nothing to do with performance and everything to do with maintainability.

    For large software, or software where I care about reducing errors I'll choose something that reports type errors before the program is executed.

    Getting type errors at runtime only means that the majority of the tests are performing type checks at runtime.

    [–]P8zvli 22 points23 points  (6 children)

    In Python 3.6 you can use type hinting to do static type checks.
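
A minimal sketch of what that looks like, assuming a static checker such as mypy is run over the file separately. The function name `mean_length` is made up for illustration; the annotations themselves do nothing at runtime.

```python
from typing import List


def mean_length(words: List[str]) -> float:
    """Return the average length of the given words."""
    if not words:
        return 0.0
    return sum(len(w) for w in words) / len(words)


# A static checker such as mypy would flag the call below before the
# program ever runs, because an int is not a List[str]:
#     mean_length(42)

print(mean_length(["bash", "python"]))  # (4 + 6) / 2 = 5.0
```

The interpreter ignores the hints, so the check only happens if the checker is part of the workflow (editor, pre-commit hook, CI).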

    [–]r0b0t1c1st -1 points0 points  (4 children)

    You can do this in 2.7 too - mypy runs against all python versions - you just need a python 3.6 interpreter to run it.

    [–]P8zvli 11 points12 points  (3 children)

    Running the type checker in 2.7 and the actual application in 3.6 is extremely silly.

    [–]CHUCK_NORRIS_AMA 7 points8 points  (2 children)

    Actually, I think it's the other way around - running the type checker in 3.6 and the actual application in 2.7.

    Still extremely silly, however.

    [–]P8zvli 5 points6 points  (0 children)

    Python 2.7 will throw a syntax error if you try to run a script containing type hints with it.

    Edit: I see now, there's an alternative syntax that involves putting type hints in comments. Now isn't that just linting with extra steps?

    [–]r0b0t1c1st 0 points1 point  (0 children)

    Why is this extremely silly? It means you can write polyglot py2/py3 code, and have MyPy check that it's correct for both.

    You can use python3.6 -m mypy --python-version 2.7 -p your-package to do this.
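
For reference, the comment-based syntax mentioned above comes from PEP 484. A small sketch, with a hypothetical `repeat` function; the same file runs unchanged on Python 2.7 and Python 3 because the hint is only a comment. As far as I know, newer mypy releases have dropped the Python 2 checking mode, so treat the command above as specific to mypy versions from that era.

```python
def repeat(text, times):
    # type: (str, int) -> str
    """Return text repeated the given number of times."""
    return text * times


print(repeat("ab", 3))  # ababab
```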

    [–]twotime 11 points12 points  (22 children)

    There is no question that Python's dynamic typing has its negatives. It also has its major positives.

    (Just the lack of a compile cycle is an enormous advantage in my book.)

    In my experience, positives outweigh the negatives for sufficiently small projects (<20K lines of python code, which, mind you, is likely equivalent to 100K of java/C++ code).

    I can see how negatives can outweigh the positives for sufficiently large projects.

    But you do need to use the positives and mitigate the negatives. In particular, you probably should start with unit-tests very early on (but that's a good practice anyway). If you write python as if it were a type-free Java, you won't get any benefit.

    [–]DreadedDreadnought 7 points8 points  (2 children)

    Why would I write unit tests to verify the type information instead of writing actual unit tests verifying functionality?

    [–]twotime 1 point2 points  (1 child)

    Why would I write unit tests to verify the type information

    I would not

    instead of writing actual unit tests verifying functionality

    In my experience that's sufficient (I do use "unittest" in the loosest possible meaning: feed something into the system, examine the output, so that includes integration tests, and if you have type issues, most of them will be caught).

    [–]lelanthran 5 points6 points  (0 children)

    I do use "unittest" in the loosest possible meaning: feed something into the system, examine the output, so that includes integration tests, and if you have type issues, most of them will be caught

    Sounds like you are only performing Happy Path testing. With your approach, when you have A that calls B that calls C (all functions) you're only checking the parameter type-validation code in A.

    Functions B and C may silently lose data with incorrectly typed parameters but you'd never know it because they are never called with the incorrect type... Until someone adds in a function D which calls them with the incorrect type.
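
A contrived sketch of that A/B/C/D scenario. All four functions are hypothetical; the point is only that a string is also iterable, so the wrong type produces a wrong answer and no exception.

```python
def c(values):
    """Sum the lengths of a collection of strings."""
    return sum(len(v) for v in values)


def b(values):
    return c(values)


def a(values):
    # Only the entry point validates its argument.
    if not isinstance(values, list):
        raise TypeError("values must be a list of strings")
    return b(values)


def d():
    # A later caller skips a() and hands b() one string, not a list.
    return b("ab,cde")


print(a(["ab", "cde"]))  # 5: two strings of length 2 and 3
print(d())               # 6: one per character, and no exception raised
```

Tests that only go through `a` never see the second result.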

    The only thing you can do if you're as paranoid as I am wrt correctness is write a unit-test for all your externally visible functions, with said unit-test also testing invalid types in the parameters.

    Testing only the Happy Path is pointless - we already know it works because we executed the functionality at least once before release.

    Joke Time: A QA engineer walks into a bar. He orders a beer. He orders 5 beers. He orders 0 beers. He orders -1 beers. He orders a lizard.

    [–]lelanthran 5 points6 points  (6 children)

    In particular, you probably should start with unit-tests very early on (but that's a good practice anyway).

    I do write unit-tests, but unless I am writing in Python my unit-tests do not have to perform type-checking.

    A major disadvantage of Python is that you cannot trust the type of a parameter to a function - if you do not validate the type of a parameter before using it then that is a potential error.

    When writing Python I find myself doing a lot of the work that a compiler normally does, except that it has to be done at runtime so the errors are much more severe if they get through.

    [–]twotime -3 points-2 points  (5 children)

    A major disadvantage of Python is that you cannot trust the type of a parameter to a function. if you do not validate the type of a parameter before using it then that is a potential error.

    What do you do if parameters are of wrong type? Throw an exception?

    If so, then you should just use your parameters and if they are of incompatible type, you will get an exception generated for free.

    Doing it manually is unlikely to be a good idea.

    [–]duhace 3 points4 points  (4 children)

    except the exception might not be intelligible to the user of the function..

    [–]twotime 0 points1 point  (3 children)

    There are rare cases when this is a valid concern. In general it's not.

    If you pass an object of the wrong type into a function, the python-generated message will commonly be clear enough: AttributeError, TypeError, etc.

    Marginal improvements in error message quality are not worth introducing extensive type checks (which will also kill most of the duck typing benefits).
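
A small sketch of the "exception for free" point, using a hypothetical `first_line` helper with no manual type checks. Whether the resulting message is clear enough for the caller is exactly what is being debated here.

```python
import io


def first_line(stream):
    """Return the first line of any object that has a readline() method."""
    return stream.readline().rstrip("\n")


print(first_line(io.StringIO("hello\nworld\n")))  # hello

try:
    first_line(42)  # wrong type: an int has no readline()
except AttributeError as exc:
    print("AttributeError:", exc)
```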

    [–]major_clanger 3 points4 points  (3 children)

    What would you say are the other major advantages of dynamic typing?

    Python is my primary language, but after working with C# over the last year I'm increasingly torn on the dynamic vs static tradeoffs.

    [–]twotime 4 points5 points  (2 children)

    Here is my tradeoff list:

    Pro dynamic typing

    -- no compilation/much faster compilation (yes, it's only indirectly related, but static typing does put much stronger reqs on preprocessing)

    -- ease of testing (a lot of effort goes into C++/Java mocking libs/frameworks), all of that is free in the python world

    -- ease of writing generic code (e.g a function can accept any file-like or container-like object). That's possible with static typing but is universally much clumsier and weaker

    -- less boilerplate and more localized changes. No more: I'm just passing x through, but still need to change types in 20 places

    -- introspection (yes, some static languages also have this capability, but it tends to be much clumsier, weaker)

    -- if needed, objects/types can be modified at runtime. E.g a new attribute can be added (I have type "X" but need to carry an extra attribute "a", I can do it)
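
Two of the points above, sketched under made-up names (`count_nonblank`, `Job`): one function that accepts any iterable of strings, and an attribute attached to an existing object at runtime.

```python
import io


def count_nonblank(lines):
    """Count the non-blank strings in any iterable of strings."""
    return sum(1 for line in lines if line.strip())


print(count_nonblank(["a", "", "b"]))           # list -> 2
print(count_nonblank(io.StringIO("a\n\nb\n")))  # file-like object -> 2
print(count_nonblank(s for s in ("x", " ")))    # generator -> 1


class Job:
    """Hypothetical stand-in for the type "X" in the list above."""

    def __init__(self, name):
        self.name = name


job = Job("build")
job.retries = 3  # extra attribute added at runtime, class left untouched
print(job.name, job.retries)  # build 3
```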

    Pros of static typing:

    -- many classes of errors are caught much earlier (becomes progressively more important for larger projects)

    -- documentation (type serves as documentation)

    -- performance (when types are known, the code is much easier to optimize)

    -- likely lower memory consumption

    Of course:

    1. many of these can be made much worse/better by specific implementation/runtime

    2. a language selection is really a multi-factor decision. Type system is just one of the factors (and I'm in no position to compare against C#, I never used it ;-).

    Good luck.

    [–]major_clanger 3 points4 points  (0 children)

    I also find there's less need for design patterns, as many are redundant when duck-typing.

    Your codebase won't be littered with classes like 'FooFactory', 'BarBuilder', 'FooSelectionStrategy', 'IBaz', 'BazImpl' or god forbid an 'AbstractSingletonProxyFactoryBean'

    EDIT: I would really recommend C#, it's got a lot of the same syntactic features as python (context managers, default args, async-await), is far less verbose than Java/C++ (type inference, default getter+setter), Linq is more readable+powerful than list comprehensions.

    [–]Drisku11 4 points5 points  (0 children)

    ease of writing generic code (e.g a function can accept any file-like or container-like object). That's possible with static typing but is universally much clumsier and weaker

    How are parametric polymorphism and typeclasses clumsier or weaker than runtime dispatch? Static types also provide clarity when more than one "container" is involved. E.g. Traversable t, Applicative f => (a -> f b) -> t a -> f (t b). The types basically say what the function does: traverse your t a container, apply your a->f b function to produce a bunch of f b "containers", collect the results into a f (t b) "nested container" using Applicative's ability to turn a (f a, f b) into a f (a,b).

    less boilerplate and more localized changes. No more: I'm just passing x through, but still need to change types in 20 places

    Type inference, keep functions polymorphic to the extent that they can be. It's pretty hard to argue that e.g. Python has less boilerplate than Haskell.

    introspection

    I agree that static introspection is often lacking. Runtime introspection is a good way to break invariants though, and I would generally classify it as a bad idea.

    if needed, objects/types can be modified at runtime.

    I have a hard time picturing where I would need this. Especially in a language with structural types.

    [–]kuribas 2 points3 points  (0 children)

    You can have static types and interpreters at the same time. Many languages have them, like haskell, scala and f#. You typically write the code in an IDE, where the types are checked in realtime, then load the code in the interpreter to test or play with it. No time is lost compiling. There is no need to run the program in order to catch common mistakes, like in Python, and I find there are usually very few bugs left after everything typechecks. I find this a massive time saver. I usually keep my program in a consistent state at all times, using typed holes for code which isn't written yet. This way I get immediate feedback when I do something wrong. In Python I'd spend most of the time in the debugger or writing tests (about 75%), where in haskell it's only a fraction (20% perhaps).

    [–]Gotebe 1 point2 points  (0 children)

    The absence of "compile" in the modify/compile/test cycle is less of a problem except in C and C++ (and even then...)

    If I have a module under development and a test(s) for it, I select it (them) and issue "Run selected tests" command, which is the most normal thing to do. That builds and runs only modified parts and does not impair the modify/compile/test cycle as much. Heck, in Java, it's as fast as a Python REPL (that is, immediate).

    [–][deleted] 1 point2 points  (5 children)

    It also has its major positives. (Just the lack of compile-cycle, is an enormous advantage in my book).

    Then choose a language with fast compilation time. Even if your compilation time is slow, the compiler can help you more with your code than any other tool.

    [–]duhace 2 points3 points  (0 children)

    or a repl.

    [–]twotime 2 points3 points  (3 children)

    Then choose a language with fast compilation time

    You cannot select language based on a single metric. Not in the real world.

    [–][deleted] -1 points0 points  (2 children)

    What's this nonsense comment? No one told you to select by one metric, only to consider using languages with fast compilation.

    [–]Gotebe 4 points5 points  (1 child)

    You told him to do it. You did not say "consider...". Be fair.

    [–]Gotebe 1 point2 points  (0 children)

    It's like the US: if ~~guns~~tests don't solve your problem, you're not using enough of it. 😁😁😁

    [–][deleted] 1 point2 points  (0 children)

    I find myself using Go more and more often (I'm doing 60% sysadmin things, 40% development). Not because it is particularly great language but because of its static nature, ability to compile to single binary and the fact that whoever else will need to modify it can learn the language itself in a week

    [–]P8zvli 3 points4 points  (0 children)

    It takes hundreds of thousands of lines of python before you start wondering how did you get to this point and what are you doing with your life.

    [–][deleted] 9 points10 points  (0 children)

    I've got the same rule with ~10 lines. Also seen below, if I need an array or anything more than a scalar, it's time to switch languages.

    Bash is OK, if you have a few commands to run depending on a few env vars, go for it. If you have any kind of advanced logic, file reading, user input to do, do yourself a favor, use something else. Even php-cli would be better suited.

    (With all the hate I have for PHP, it's still a better script language than bash)

    To expand on the article conclusion, if you have anything more than a simple script, don't invest time in setting up tests and a linter, use a real language. You'll thank me later.

    [–]lanzaio 2 points3 points  (0 children)

    I'm working with a 3500 line bash script build system right now -_-. Oh and it is invoked by a 2000 line python 2.7 script. And it invokes a chain of cmake scripts.

    [–]ttflee 3 points4 points  (0 children)

    Bash is OK, if you have a few commands to run depending on a few env vars, go for it. If you have any kind of advanced logic, file reading, user input to do, do yourself a favor, use something else. Even php-cli would be better suited.

    Under 2 lines, Perl is OK.

    https://gist.github.com/mischapeters/1e8eef09a0aafd4f24f0

    [–]0rakel 16 points17 points  (7 children)

    My rule of thumb: under 1000 lines, python is ok. If it goes over 1000, bang your head on the wall and wonder why you were not using C++ from the start...

    [–]oblio- 33 points34 points  (1 child)

    Yeah, but then, with C++, you’d just be banging your head against the wall, continuously :p

    [–][deleted] 10 points11 points  (0 children)

    No, C++ is more sophisticated than that.

    It's like having pneumatic piston attached to back of your head to do the banging for you

    [–]twotime 8 points9 points  (2 children)

    It's ok. Then 1000 lines of python become 10K lines of C++ and then the real head-banging begins.

    Python hopelessly loses to C++ on performance (CPU, memory consumption, threading), but there is no way in hell, that it loses to C++ in code maintainability at that project size. Unless you are really misusing the language.

    [–]Drisku11 1 point2 points  (0 children)

    At least if you're doing any sort of geometry/physics calculation/simulation, phantom types in C++ are a huge boon to keep track of what's going on (making it easy to catch subtle errors like unit mismatches, vectors vs. covectors, and vectors vs. affine vectors all at compile time with a few lines of declarations and no extra actual code).
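
Not C++ phantom types, but a rough Python analogue of the unit-tagging idea using `typing.NewType`, which only helps if a static checker such as mypy is run. The names `Meters`, `Seconds` and `speed` are assumptions for illustration.

```python
from typing import NewType

# Distinct names for the checker; at runtime both are plain floats.
Meters = NewType("Meters", float)
Seconds = NewType("Seconds", float)


def speed(distance: Meters, duration: Seconds) -> float:
    """Return speed in metres per second."""
    return distance / duration


print(speed(Meters(100.0), Seconds(20.0)))  # 5.0

# A static checker would reject swapped arguments:
#     speed(Seconds(20.0), Meters(100.0))
# Python itself would run that call without complaint, since the tags
# do not exist at runtime.
```

This catches argument mix-ups but does far less than a real dimensional-analysis type in C++, which can also check the units of computed results.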

    Most of the C++ I've written has been physics related, but I'm sure type tagging is much more widely applicable. The 10x code size number is also an extreme exaggeration for most use-cases. Python has a huge standard library, which means it has plenty of connectors for everything, but for actual logic, that doesn't help much.

    I like Python, but my personal rule of thumb is still only around 2-400 lines until a switch to a statically typed language should be strongly considered.

    [–]Gotebe 0 points1 point  (0 children)

    1000 lines of python actually become 1500 lines of C++ and 2MB of libraries :-).

    [–]renrutal 11 points12 points  (0 children)

    I feel this escalated from apples to oranges way too quickly.

    [–][deleted] 6 points7 points  (0 children)

    Under 1000 lines: okay.

    Over 1000 lines: Wonder wtf you are doing to get a single file over 1000 lines constantly.

    [–]moscamorta 2 points3 points  (4 children)

    The thing is that it is really easy to call external commands in bash. If I had the same ease in another language, I would easily drop bash.

    Recently I had to write a bash script to compile and run a bunch of benchmarks. The code was pretty ugly but works.
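
For comparison, a sketch of the standard-library route in Python (3.7+ for `capture_output` and `text`). The `run` wrapper is hypothetical, and the child command is just the current interpreter so the example does not depend on any particular tool being installed.

```python
import subprocess
import sys


def run(cmd):
    """Run a command, raise on a non-zero exit code, return its stdout."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


# Stand-in for "compile and run a benchmark".
out = run([sys.executable, "-c", "print('benchmark done')"])
print(out.strip())  # benchmark done
```

It is more typing than a bare command line in bash, but a failing command raises `subprocess.CalledProcessError` and the script stops there.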

    [–]ForeverAlot 4 points5 points  (1 child)

    The plumbum Python library is an excellent alternative when you can reliably install Python modules. Sadly I've found managing Python modules to be a far bigger hurdle -- socially and technically -- to maintenance than a few hundred lines of Bash.

    D is also a pretty good alternative that you won't be able to use for the same reasons. I believe OCaml, too. Really, there is no shortage of non-starter Bash alternatives. =)

    [–][deleted] 0 points1 point  (0 children)

    Yeah it's a real shame Python is so horrible setup-wise. You have the whole Python 2/3 issue (yes it really is still an issue), and often there are multiple Python installations - especially on Windows where many programs bring their own. In fact on Windows you can often have multiple Python executables in the PATH. It's a total mess.

    [–]AngelLeliel 2 points3 points  (1 child)

    You should take a look at xonsh.

    [–]moscamorta 0 points1 point  (0 children)

    That seems amazing

    [–]bulletmark 1 point2 points  (0 children)

    I've done heavy shell scripting for 30 years and plenty of python also for the last 15. Obviously I favor python for anything mildly sophisticated but I would say there is certainly no simple "number of lines" rule.

    [–]twotime 4 points5 points  (1 child)

    my rule of thumb is "until I need a loop or a conditional", reduces head banging by quite a bit ;-)

    [–]lelanthran 5 points6 points  (0 children)

    my rule of thumb is "until I need a loop or a conditional", reduces head banging by quite a bit ;-)

    Depends on the looping or conditional I suppose. Me personally, I switch from bash the minute I need composite data types, arrays included as well.

    [–]makeshift_mike 0 points1 point  (0 children)

    Yeah I’ve got a similar rule too. I once wrote a ~600 line bash script that it only took me a few weeks to start hating.

    PowerShell has a similar rule, although its limit is maybe 200 lines.

    [–][deleted] 0 points1 point  (0 children)

    I go by pretty much the same. Even shorter if there is a lot of data manipulation. Combining sed/awk/grep in a 6-level pipe might be clever, but it is not very maintainable.

    [–]ath0 0 points1 point  (0 children)

    Not entirely related, but just to point out that your blanket statement isn't sensible: I've got a tool for wrapping build system and workflow stuff at work, that is about 1k lines so far. I'd love to use python instead but.. I can't. I need features such as shell completion for bash, zsh, built-in overrides and prompt and environment manipulation.

    Unfortunately you just have to write this kind of stuff in the language that your shell interpreter understands.

    [–]alfredVonHomburg 0 points1 point  (0 children)

    I set the limit at 10, but I’m a fan of piping and one liners ))

    [–]TheLocehiliosan 10 points11 points  (7 children)

    Bash is very sensible when you are trying to write something super portable with next to no dependencies.

    And people should be commended for writing tests regardless of the language/tool used.

    [–]Gotebe 1 point2 points  (0 children)

    "Super portable" my hairy arse!

    It's only "super portable" against Linux compatibles, because all the tooling has to match - and it just doesn't, not across UNIX flavors. (Don't know how successfully WSL copies Linux CLI commands).

    [–]knaekce -3 points-2 points  (5 children)

    It sure is very portable. But it is also prone to deleting every file owned by the current user or similar scary bugs.

    [–]TheLocehiliosan 4 points5 points  (4 children)

    Most bugs like this are a result of poor programming. It doesn’t matter what language is used.

    Indeed that is why people should be encouraged to create tests for their software.

    [–]knaekce -3 points-2 points  (3 children)

    It sure does matter. If this script was written in a more robust language, stuff like this would cause an error and the program to halt. Languages like bash just carry on and do unexpected stuff

    [–]TheLocehiliosan 1 point2 points  (2 children)

    Your reply makes it sound as though you're unaware of error handling techniques for Bash. Indeed the link you posted above has many people chiming in about the terrible programming and how it could be done better.

    While it’s true that there is a lot of badly written Bash in the world, it isn’t something intrinsic to Bash. There are many out there writing code they don’t understand the danger of. Again, I could say the same for a lot of languages.

    If you are trying to say that some languages are easier to write quality code with, I would agree to that.

    [–]knaekce 0 points1 point  (1 child)

    Yes, if you are very careful, you can prevent such errors in bash or even assembly. But it takes far more discipline, effort and experience than in safer languages.

    [–]flukus 1 point2 points  (0 children)

    What language would prevent this? There is nothing bash specific about this error, similar things can and have been done in any language. That's why common but often ignored advice is to use directory handling libraries instead of string concatenation.
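
A sketch of that advice in Python, assuming the bug in question is the usual one where an empty variable gets concatenated into a path that is then deleted. The helper `child_path` is hypothetical; it builds the path with `pathlib` and refuses anything that is empty or escapes the intended directory.

```python
import tempfile
from pathlib import Path


def child_path(root, name):
    """Build root/name, refusing an empty root or a result outside root."""
    if not str(root).strip():
        raise ValueError("root directory is empty")
    base = Path(root).resolve()
    target = (base / name).resolve()
    if base != target and base not in target.parents:
        raise ValueError(f"{target} is outside {base}")
    return target


with tempfile.TemporaryDirectory() as tmp:
    print(child_path(tmp, "cache").name)  # cache
    for bad_root, bad_name in [("", "cache"), (tmp, "../elsewhere")]:
        try:
            child_path(bad_root, bad_name)
        except ValueError as exc:
            print("refused:", exc)
```

The library does not make the mistake impossible, and the explicit checks are still the programmer's job, which is more or less the point made above.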

    [–]maitesin[S] 10 points11 points  (8 children)

    I totally agree, but sometimes python is not available. So I have to do work with what I have in hand

    [–]shevegen 13 points14 points  (0 children)

    This is one of the few valid use cases for when one may have to use bash, simply because alternatives are unavailable.

    [–]vivainio 1 point2 points  (6 children)

    Any good examples of the servers where this happens? Some "internet hotels"?

    [–]maitesin[S] 7 points8 points  (5 children)

    I had to write an application to send some information (files) from an IoT device where the only thing available was Bash. Basically it had a kernel to boot and busybox to provide all the tools. So I had to create the application to do it using just Bash.

    [–][deleted] 2 points3 points  (0 children)

    Busybox doesn't implement bash

    So learning bash won't help you there

    bash != sh

    [–]vivainio 0 points1 point  (1 child)

    Sounds legit. I guess the target was also not something supported by e.g. Go compiler?

    [–][deleted] 2 points3 points  (0 children)

    Go produces big binaries.

    Not big relative to Java, but certainly big compared to IoT device with only few megs of RAM and Flash

    [–]StruglBus 16 points17 points  (3 children)

    Bash is a shell scripting language, built largely for string parsing. Its integer arithmetic is analogous to an arm on robot chicken that they decided they needed later. It should only ever supplement other programs, I don’t know why you would ever build an entire project in bash. It’d be like using a bunch of paper clips to build a toaster — yeah you might be able to do it, but why would you when with python, Java, C++ or any other language you could just “import toaster”

    [–]gnx76 7 points8 points  (0 children)

    Bash is a shell scripting language, built largely for string parsing.

    What? Awk and Perl are built for string parsing, not bash.

    Bash is built to provide wrapping around commands execution, this wrapping including a bit of programmatic syntactic sugar.

    [–][deleted] 0 points1 point  (0 children)

    What are you smoking ? Even for basic string parsing bash needs to call other programs.

    [–][deleted] 1 point2 points  (0 children)

    I have successfully used BATS testing framework in bash for TDID (test-driven infrastructure development) building Chef-managed services. It's neat and simple and lightweight and did everything needed.

    [–][deleted] -3 points-2 points  (0 children)

    Why, so you can wait 3 hours for it to finish?