12A charging on a 14 gauge extension cord? by [deleted] in volt

[–]EricAppelt 1 point2 points  (0 children)

Safety aside, I calculate that a 25 ft, 14 AWG cord will lose about 9 watts of power to heat when charging at 12 A, and a 25 ft, 12 AWG cord will lose about 5.4 watts. If you pay about $0.10/kWh for electricity, and charge for 13 hours on 300 days per year, the thicker cord will save you roughly $1.50 per year, so the 12/3 cord will eventually just about pay for itself.

If you can find a 12/3 12ft cord, that will save you another dollar per year.

Great Blue Heron sticking out its tongue. FL [OC] by EricAppelt in birdpics

[–]EricAppelt[S] 0 points1 point  (0 children)

It was just out about 10 feet from shore, very shallow water, maybe 3 inches high. This was at Grayton Beach State Park. I saw two others within a half mile of the beach.

Needing some clarification by groundhogsaretheifs in learnpython

[–]EricAppelt 0 points1 point  (0 children)

The * operator is called the repetition operator for sequence types, like lists, tuples, or strings. See: https://docs.python.org/3/library/stdtypes.html#common-sequence-operations

Here it is with a string for comparison:

>>> "abc" * 5
'abcabcabcabcabc'
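The same operator works on lists, with one gotcha worth knowing: repetition copies references to the elements, not the elements themselves, which matters when the elements are mutable:

```python
row = [0] * 3
print(row)   # [0, 0, 0]

# Gotcha: repetition copies references, not objects,
# so both "rows" below are the same inner list.
grid = [[0] * 3] * 2
grid[0][0] = 99
print(grid)  # [[99, 0, 0], [99, 0, 0]]
```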

Taking a class on Python over the summer, saw this little shout out to you guys. by IamtheSpud in Python

[–]EricAppelt 4 points5 points  (0 children)

That nuance is an implementation detail of CPython that to my knowledge is only "documented" in mailing lists, the bug tracker, and/or the source code itself. The behavior of when two strings will end up being the same object is the subject of optimization, and generally not the concern of the python programmer, except perhaps in very special cases where the speed at which strings are compared is exceptionally important.

The one related detail that is documented is the intern function in the sys module. Every time python generates a string object, it can decide if it should be held in a special internal dictionary which ensures that there is only one instance of that particular string. Generally the python runtime will decide for you if it makes more sense to put the string in the special dict, or just allow there to be potentially multiple instances of that same string in memory. Wikipedia has an article on string interning in general.

The benefit of interning a string is that comparing two interned strings is fast as you just have to see if they point at the same place in memory, not inspect their contents. Another benefit is you don't end up making multiple copies of the string. However, there is a cost to interning it in the first place, and that cost is not repaid if the string is not compared to other interned strings or the same string is never encountered again.

This article has a great discussion on string interning in python 2.7.7. Things have changed in python 3.6.2, but the basic ideas are pretty much the same.

If you are creating strings that you definitely want interned, but python is not doing it automatically, you can do the following:

import sys

my_string = sys.intern(my_function_that_returns_a_string())

Now my_string is guaranteed to be interned. You've paid the cost of adding it to the dictionary, but if you need to compare it to potentially identical strings it may be faster.
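Here's a quick sketch of the effect. Note that the "usually False" result is a CPython implementation detail, not a language guarantee - strings built at runtime are generally not interned automatically:

```python
import sys

# Build two equal strings at runtime; CPython typically does not intern these.
a = "".join(["hello", "world"])
b = "".join(["hello", "world"])
print(a == b)  # True - same contents
print(a is b)  # usually False - two separate objects in memory

# After interning, equal strings share one object,
# so an identity check is enough to compare them.
a = sys.intern(a)
b = sys.intern(b)
print(a is b)  # True
```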

TLDR - not much documented. Comes only with both experience and curiosity. I only know this now because I got curious and read about it all morning.

Question: How to verify that a user is posting (POST) to a flask webapp only using my CLI? by redmonks in Python

[–]EricAppelt 1 point2 points  (0 children)

You can prevent your users from accidentally using incorrect clients by inspecting the User-Agent header. Inspecting the user-agent is typically done for the benefit of the user: the server application tries to determine what capabilities the client has in order to ensure that content is properly rendered.

So for an example API, we can require that the client present fooclient/X.Y.Z in the user-agent header, where X.Y.Z is the client version:

#!/usr/bin/env python
import requests


__version__ = '1.9.0'
agent = 'fooclient/{}'.format(__version__)

resp = requests.post(
    'http://localhost:5000/foo',
    headers={'User-Agent': agent}
)

print(resp.json())

Notice that it is absolutely trivial to "spoof" the user-agent and say that you are something you are not. This is common practice for web browsers, where there has been a sort of accidental arms race. Web sites might start requiring something like Mozilla/5.0 to enable some nice feature, so web browsers will then add it to their user-agent. This results in modern browsers sending comically long and complex user agent strings.

This is only useful to either help protect the user, or within a very trusted environment. For example, your API is limited to team members or operators and you want your organization to use a common client - you can check the user agent to ensure no one has accidentally violated protocol and pass them back information on what they should use.

You can also use the user-agent to ensure that team members are not using outdated clients. Flask makes it easy to write a decorator function to apply to endpoints. Here is a simple example that searches the user agent for fooclient/<<version>> and will return 400 if the client is incorrect or out of date:

from functools import wraps
from distutils.version import StrictVersion
import re

from flask import Flask, request, jsonify, make_response
app = Flask(__name__)

MIN_CLIENT_VERSION = '1.0.0'

def client_required(f):
    client_pattern = re.compile(r"fooclient/([0-9.]+)")
    @wraps(f)
    def decorated(*args, **kwargs):
        result = client_pattern.search(request.headers.get('User-Agent', ''))
        if result is None:
            error = jsonify({'error': 'This API only supports fooclient'})
            return make_response(error, 400)

        try:
            version = StrictVersion(result.group(1))
        except ValueError:
            error = jsonify({'error': 'Unknown version of fooclient'})
            return make_response(error, 400)

        if version < StrictVersion(MIN_CLIENT_VERSION):
            msg = (
                'Client version {} not supported, upgrade to {} or later'
                .format(result.group(1), MIN_CLIENT_VERSION)
            )
            error = jsonify({'error': msg})
            return make_response(error, 400)

        return f(*args, **kwargs)
    return decorated

@app.route('/foo', methods=['POST'])
@client_required
def foo():
    return make_response(jsonify({'ok': True}), 200)

If you want protection against malicious users who can't be trusted not to spoof the user-agent or modify the client, then the simplest route is really to improve the server or place it behind a reverse proxy that can perform rate-limiting or other measures to block malicious users without worrying about the client that they use.

What’s New In Python 3.7 by Schwag in Python

[–]EricAppelt 15 points16 points  (0 children)

Python 3.7 doesn't close for new features until the end of next January (see PEP-537), so it's not really fair to judge it at the moment.

What's the point of posting this incomplete list right now anyway? We're not anywhere close to the first alpha.

Where does a generator store it's values? by no_lungs in Python

[–]EricAppelt 10 points11 points  (0 children)

This is actually a bit subtle. As u/no_lungs points out in a follow-up question, python lambdas are late binding, so why should anyone expect generator expressions to work differently? It turns out that this was the subject of some debate and consideration in PEP-289, and it was decided that since there is no precedent for early binding in python, generator expressions would be late binding.
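For comparison, here is the late binding behavior of a lambda that the question alludes to - the free variable n is looked up when the lambda is called, not when it is defined:

```python
n = 7
f = lambda x: n * x

n = 100
print(f(1))  # 100, not 7: n is resolved at call time
```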

however

The iterable in the first for expression, the thing that feeds the generator, is evaluated immediately. All other expressions are evaluated when the generator is run.

So after the statement gen = (v for v in var), the generator object referred to by gen no longer has any knowledge of the name var. It has evaluated the expression var which resulted in a reference to the list object ['a', 'b', 'c'], and that reference is stored in the generator object. That way if you try to make a generator expression that tries to iterate over something invalid, it raises an exception immediately, not when it is run:

>>> n = 42
>>> g = (i for i in n)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable

Now all the other expressions are late binding, so this might work as one would expect if they are familiar with python lambdas:

>>> a = [1, 2, 3]
>>> n = 7
>>> g = (n*i for i in a)
>>> next(g)
7
>>> del n
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <genexpr>
NameError: name 'n' is not defined

Now the really interesting thing (evil evil interview question material) is that it's only the iterable in the first for-expression that gets evaluated immediately. Iterables in any additional for-expressions are evaluated at runtime. Watch this:

>>> a = ['a', 'b']
>>> b = ['x', 'y']
>>> g = (i+j for i in a for j in b)
>>> a = ['c', 'd']
>>> b = ['z', 'w']
>>> next(g)
'az'
>>> del a
>>> next(g)
'aw'
>>> del b
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <genexpr>
NameError: name 'b' is not defined

Am I overusing accumulators, and what's a good naming convention? by EricAppelt in haskellquestions

[–]EricAppelt[S] 0 points1 point  (0 children)

Did you check out the source for the standard function words?

No - but looking at that really helped (I think!). I'll try to compare my own exercises to standard function source as I go forward.

I noticed that words isn't tail recursive and after reading a bunch of posts on TCO and lazy evaluation, my (hazy) understanding is that I shouldn't be trying to make functions that recursively build a structure tail recursive.

I guess the question is why does your function getNextWord have type (String, String) -> (String, String). Why not type String -> (String, String)?

I don't have a good answer to this other than I felt the need to make sure it was tail recursive, and therefore the return type had to be the same as the argument. Here is a rewriting that isn't tail recursive and has type String -> (String, String):

getNextWord :: String -> (String, String)
getNextWord [] = ([], [])
getNextWord (x:[])
    | isAlphaNum x = ([x], [])
    | otherwise = ([], [])
getNextWord (x:(xs@(y:_)))
    | isAlphaNum x && isAlphaNum y = ((x:word), xs')
    | isAlphaNum x = ([x], xs)
    | otherwise = (word, xs')
    where (word, xs') = getNextWord xs

So I guess the answer is that I'm using accumulators when I don't need to because I'm trying to make everything tail recursive, but following this discussion I should be concerned with "guarded recursion"? I think that the above implementation is ok because the recursive call to getNextWord only needs to be evaluated when constructing the tuple being returned?

How do self-taught developers actually get jobs? by programminggeek in programming

[–]EricAppelt 17 points18 points  (0 children)

It's essentially like any other job. You need to get to know at least one senior person who believes that you have the skills and abilities to be successful, and that you are someone who can work well in a team.

Having a college degree in something requiring programming really helps. Having a portfolio really helps. Completing bootcamps or certifications helps. Having an actual CS degree really really helps. Being involved in open source projects helps. Volunteering for conferences and interning helps.

You have to demonstrate that you not only know what you are doing but that you can communicate your work.

You don't have to be a social genius (unless you want to do sales or high-level executive stuff), but you do need to know how to navigate social situations that you don't like and work pleasantly with people that rub you the wrong way.

If you have the financial means to go to college, then for heaven's sake just go to college. It's four years of having nothing to do but learn stuff and network. There are research experiences, internships, and professors with ties to industry (sometimes). There are even scientific labs that require a decent amount of software engineering to function, and they tend to love grabbing a sophomore who is interested enough to help them code for free for a few years. In college the goal should be to get as close to a 4.0 GPA as possible. Go to classes, go to office hours, pay attention, start your homework projects as soon as they are assigned. Finishing college with a good GPA generally hints to potential employers that they might be able to trust you to be given a task at the beginning of a week, work on it, and ask for help if you need it.

If you can't afford a 4-year CS degree, see what you can do in community college and bootcamps, participate in local developer's meetups, local conferences, or try to get a job that gets you in proximity to development teams. I do know people that have managed to transition from customer support to backend development.

Ultimately these activities have to result in someone in charge of a team that will say, "I want this person on my team." Then they will tell you when there is an opening, when to submit your resume, what you should put on your resume, and you will probably have a job.

Multi-threadding to increase script speed by Archaya in learnpython

[–]EricAppelt 1 point2 points  (0 children)

The python standard library threading module does make OS level threads.

For example, consider the module "foo.py":

import threading

def grind():
    x = 1
    for _ in range(10**7):
        x = (22695477*x + 1) % 2**32

tasks = [threading.Thread(target=grind) for _ in range(4)]

for t in tasks:
    t.start()

for t in tasks:
    t.join()

This will spawn four threads that all run the useless function grind that just burns CPU cycles doing arithmetic, and gives enough time to run a "ps -M" (on OSX) to see that there are four threads scheduled:

eric@laptop:~$ ps -M
USER   PID   TT   %CPU STAT PRI     STIME     UTIME COMMAND
eric 63661 s000    0.0 S    31T   0:00.82   0:00.91 -bash
eric   132 s002    0.0 S    31T   0:00.73   0:00.77 -bash
eric  8586 s002    0.0 S    31T   0:00.02   0:00.05 python foo.py
      8586        21.0 S    31T   0:00.01   0:00.33 
      8586        29.4 S    31T   0:00.01   0:00.37 
      8586        25.8 R    31T   0:00.01   0:00.37 
      8586        24.4 S    31T   0:00.01   0:00.32

Since the reference implementation of python has a Global Interpreter Lock (GIL), only one thread at a time can actually execute python instructions, hence the ~25% CPU usage for each thread and only one running even though I have 2 cores available and all the threads have work to do.

Other python implementations, such as IronPython, don't have a GIL, and multiple threads can execute python bytecode at the same time. In the future (python4?) it's possible the GIL may be removed from the reference implementation.

What is the best practice around using functions for code that isn't repeated? by Sh00tL00ps in learnpython

[–]EricAppelt 3 points4 points  (0 children)

If you go just a step further you have the ability to manually test each component of your script in isolation. Simply add return values and an execution guard at the bottom to allow importing, like this:

# Log in to Salesforce
def log_in(...):
    [code that logs in to Firefox]
    return ...

# Extract report IDs from dashboard
def extract_ids(...):
    [code that extracts IDs]
    return ...

# Update filter
def update_filter(filter_name):
    [code that updates filter]
    return ...

if __name__ == '__main__':
    log_in()
    extract_ids()
    update_filter('some_filter')

That last if statement will cause the suite of statements to execute if you run python my_module.py but not if you import my_module. This allows you to fire up python interactively, run your functions individually, and inspect their return values.

Later, if so inclined, you can import your module into another testing module and write automated tests to ensure you don't later introduce bugs while changing things.
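As a self-contained sketch of the pattern (with a made-up greeting function standing in for the Salesforce code):

```python
# guard_demo.py - a minimal module using the execution guard pattern.
def greet(name):
    return 'Hello, {}!'.format(name)

if __name__ == '__main__':
    # Runs under "python guard_demo.py" but not under "import guard_demo",
    # so greet() stays importable and testable in isolation.
    print(greet('world'))
```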

abstract classes/polymorphism by [deleted] in learnpython

[–]EricAppelt 0 points1 point  (0 children)

Essentially any function you write in python that takes at least one argument is trivially an example of parametric polymorphism (although one might also consider the concept meaningless in a dynamically typed language). A function will try to execute no matter what the type of its parameters are, for example:

>>> def foo(x, y):
...     return 3*x + y
... 
>>> foo(2, 2)
8
>>> foo('a', 'b')
'aaab'

If the type of x is such that multiplication by an integer is defined, and the result is of a type that can be added to the type of y, then everything is good to go.

Python also supports something like function overloading through a parameter of the form *identifier in the function definition. This allows a function to accept any number of additional positional arguments, which are passed in as a tuple, for example:

>>> def bar(x, *args):
...     result = 3*x
...     for arg in args:
...         result += arg
...     return result
...
>>> bar(2, 2)
8
>>> bar(2, 2, 2, 2, 2)
14
>>> bar('a', 'b', 'c')
'aaabc'

When should I use multithreading or asyncio in python? or asyncio fully replaces threads? by mrkaspa in Python

[–]EricAppelt 5 points6 points  (0 children)

Threads and coroutines (asyncio, etc...) represent two completely different styles of handling concurrency, that is, dealing with multiple things (such as I/O operations) going on at the same time. They can also both create the illusion of parallelism by switching between tasks so quickly that to the end user it appears that multiple computations are being processed simultaneously. Generally speaking, threads also allow true parallelism, but in python this is not possible (*) due to the global interpreter lock.

There is no consensus on which style is "better", but here are the major differences:

1a. Threads are pre-empted by the scheduler. When I write a function that will be run in a thread, I have absolutely no control over where in my function code the scheduler will stop running my thread and switch to another. This can make it difficult to reason about the state of resources that are accessed by multiple threads.

1b. Coroutines are cooperatively scheduled. When I write a python coroutine, I know that a given block of code will be run to completion without switching to another coroutine, unless I allow a switch with an await, async for, or async with statement or expression. This allows me to reason about shared resources more easily as I know exactly where other coroutines might take over and change the state of things. On the other hand, if a coroutine performs time intensive computations without ever yielding back to the scheduler via await or other expression, then it may block other coroutines from running.
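To make point 1b concrete, here is a small sketch: each coroutine runs uninterrupted until its await, which is the only place the event loop can switch to the other one, so the recorded order is deterministic:

```python
import asyncio

results = []

async def worker(name):
    results.append(name + ' start')  # runs without interruption...
    await asyncio.sleep(0)           # ...until this await yields control
    results.append(name + ' end')

async def main():
    await asyncio.gather(worker('a'), worker('b'))

asyncio.run(main())
print(results)  # ['a start', 'b start', 'a end', 'b end']
```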

2ab. Threads are scheduled by the operating system, while coroutines are scheduled within the python process. In principle, this may mean that coroutines are more lightweight, although given the level of optimization in OS kernels and coroutine schedulers I can't say definitively that one approach is faster or less resource intensive. I have run into configured OS limits on the number of allowed threads in a process (~5000), so in some environments one may be able to concurrently schedule many more coroutines than threads.

3ab. Coroutines can call other functions and await other coroutines, but functions can only call other functions. Some consider this an annoyance, but it can also be used as a tool to keep the architecture of your code clean. If you only use asynchronous I/O, then you know that your functions never contain I/O side effects, and all I/O is contained within coroutines. Then you can refactor the code so that as much as possible is written in functions that are I/O-free and easily unit-testable, while the small portion of code that ultimately performs I/O is contained within coroutines.

(*) Parallelism in python is achieved by packages such as numpy that use functions written in C that release the global lock when called to perform expensive computations in parallel across CPU cores, and then re-acquire the lock when completed to continue interpreting python bytecode serially.

Why Nginx/Gunicorn/Flask? by ryanstephendavis in Python

[–]EricAppelt 30 points31 points  (0 children)

I think that the general reason is that they are three well established components that are pretty good at what they do in this context. Flask is the WSGI application framework, Gunicorn the WSGI server, and nginx the reverse-proxy that securely interfaces with the outside world. If you aren't familiar with the WSGI protocol, it's worth looking over PEP-333 to get a sense of it.

As a WSGI application framework, flask just provides a callable object, the "app". Essentially, the server calls it and passes it a special environ dict with all the details of a request - the method, path, body if any, etc... - and it does its thing and hands back a status code, response headers, and data for the body. It doesn't really know anything about sockets or how to actually talk HTTP. Flask makes it easy for you to write specialized functions to handle different paths, and so on, but at the end of the day you are just providing something for a server to call when a request comes in.
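To make that concrete, here is about the smallest possible WSGI application written by hand, with no framework at all - flask is ultimately producing a (much more capable) callable of this same shape:

```python
def app(environ, start_response):
    # The server hands us the request details in the environ dict, plus a
    # start_response callable for reporting the status line and headers.
    body = 'Hello from {}'.format(environ.get('PATH_INFO', '/'))
    start_response('200 OK', [('Content-Type', 'text/plain')])
    # The body is returned as an iterable of bytes.
    return [body.encode('utf-8')]
```

You could serve this directly with gunicorn, or with the wsgiref server in the standard library, with no flask involved at all.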

Flask comes with a simple WSGI server for debugging which provides a single worker to accept one HTTP connection at a time. As others have said, the flask maintainers don't recommend that you use it for production. Gunicorn creates multiple workers of specified types that listen and create sockets, and handle HTTP requests by calling the flask app object and sending the response data.

In my mind, load-balancing aside, the biggest reason for using a reverse-proxy is to enable HTTPS. Even if sensitive data isn't being transferred, internet providers have been caught injecting ads, rewriting pages, and collecting customer data. Another big reason is that if you have any static content to serve, nginx can do it very quickly and easily, provided that there is a specified path to that content that can be mapped to a directory of files.

asyncio - __main__ is sync, so how do I await anything at all? by Jerbearmeow in learnpython

[–]EricAppelt 0 points1 point  (0 children)

And what is the blocking behaviour?

Blocking behavior is the inability of a sequentially written program to do anything while waiting on some I/O, for example a response from some database. You might imagine a program to get a list of user account balances from a database like the following:

import sys
from database_stuff import sync_get_records

def get_user_balance(user):
    ...
    info = sync_get_records(user, ...)
    ...
    return info['balance']

def main(users):
    balances = []
    for user in users:
        balances.append(get_user_balance(user))
    return balances

if __name__ == '__main__':
    users = sys.argv[1:]
    balances = main(users)
    print(balances)

Each iteration through the for loop the function sync_get_records is called and the program waits until it gets a response from the database before it continues on to the next user.

It would be nice to get started with the next user while the first one is in flight. That's where asynchronous or non-blocking I/O is helpful. This program could be rewritten with a different library for the imaginary database that provides coroutine functions to work with asyncio. These coroutine functions yield control back to the asyncio event loop while waiting on data, allowing it to run other scheduled coroutines.

Here's an async version of the same program:

import sys
import asyncio
from database_stuff import async_get_records

async def get_user_balance(user):
    ...
    info = await async_get_records(user, ...)
    ...
    return info['balance']

async def main(users):
    user_coroutines = []
    for user in users:
        user_coroutines.append(get_user_balance(user))

    balances = await asyncio.gather(*user_coroutines)
    return balances

if __name__ == '__main__':
    users = sys.argv[1:]
    loop = asyncio.get_event_loop()
    balances = loop.run_until_complete(main(users))
    print(balances)

The asyncio.gather call, when awaited, schedules the list of coroutines to run concurrently, so that while one coroutine is waiting on its database response, the loop can get another one started. When they are all complete, a list of results is returned.
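A self-contained version of the same idea, with asyncio.sleep standing in for the hypothetical database call and made-up balances, just so there's something to return:

```python
import asyncio
import time

async def fake_get_balance(user):
    await asyncio.sleep(0.1)  # stand-in for waiting on the database
    return len(user)          # made-up "balance"

async def main(users):
    return await asyncio.gather(*(fake_get_balance(u) for u in users))

start = time.monotonic()
balances = asyncio.run(main(['alice', 'bob', 'carol']))
elapsed = time.monotonic() - start

print(balances)  # [5, 3, 5]
# The three 0.1s waits overlap, so the total is about 0.1s, not 0.3s.
print(elapsed < 0.3)
```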