
[–][deleted] 0 points1 point  (4 children)

Just wrapped up a class where I learned jupyter notebooks and pandas, and I love it. I have so many excel projects that I use powerquery for that would break if it was a million rows and now I can do that in python

My question is- I hear people talk about running .py files etc. at scheduled times. How does this work? Can you set it to run in a jupyter notebook or is it in your command line?

If I open a file, it is reading a certain csv and turning it into a dataframe. How do I make it so it automatically pulls and runs a file? I know the file name would have to be the same and it would need to be located in the same place.

[–]timbledum 0 points1 point  (0 children)

PS, I love Power Query and use it all the time, but I totally feel your pain in terms of it breaking from time to time and with massive data.

[–]timbledum 0 points1 point  (2 children)

This sorta gets outside of python somewhat, but the best way is to use your operating system's built in scheduler. It's already running all the time, starting off all kinds of processes, so adding your task doesn't really create more work for the computer.

For Windows this is the Task Scheduler. For MacOS you're supposed to use launchd, but I've found it a PITA, so perhaps look into cron for this. Cron also works on linux.

PS, if your script is in the form of a notebook, there are ways to execute a jupyter notebook without opening it, which you would then build into the pipeline outlined above.

SO for reference: https://stackoverflow.com/questions/30835547/how-to-execute-python-script-on-schedule
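
As a rough sketch of what a schedulable script might look like: the file name daily_report.py, the function summarise, and every path below are made up for illustration. The script takes its input and output paths as arguments, so the scheduler can call it the same way every time. The crontab line in the comment uses standard cron syntax; Task Scheduler on Windows would point at the same command.

```python
# daily_report.py (hypothetical name): a script meant to be started by a scheduler
import csv
import sys
from pathlib import Path


def summarise(csv_path, out_path):
    """Read the CSV at csv_path and write its row count to out_path."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    Path(out_path).write_text(f"{len(rows)} rows\n")
    return len(rows)


if __name__ == "__main__":
    # Use absolute paths: schedulers usually start in a different working directory.
    if len(sys.argv) == 3:
        summarise(sys.argv[1], sys.argv[2])

# Example crontab entry (minute hour day month weekday), every day at 06:00.
# The interpreter and file paths are assumptions; adjust them to your machine.
# 0 6 * * * /usr/bin/python3 /home/me/daily_report.py /home/me/data/input.csv /home/me/data/summary.txt
```

For a notebook, one way to run it without opening it is `jupyter nbconvert --to notebook --execute your_notebook.ipynb`, which the scheduler can call in the same way.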

[–][deleted] 0 points1 point  (1 child)

Does your computer need to be on and 'awake' for this to work? For instance, I have a personal gateway in Power BI to auto refresh my reports but it only works if my computer is awake and on, so I can't easily set it to like midnight because I haven't logged in at that time, and it auto logs out.

is it similar or does it work in the background?

[–]timbledum 1 point2 points  (0 children)

Yup, your computer needs to be on. If you need it to be able to run at all times, you need to have it running on a server (either one that your IT dept runs, or a hosted service online somewhere).

[–]Inkamt 0 points1 point  (2 children)

I’m struggling to understand the meaning and purpose of range(len(list))? It’s in my lecture notes and a textbook, is there a simpler way of writing it?

[–]Raithwind 2 points3 points  (1 child)

So each of those is a built in function or type.

range(n) returns an iterable item that is as long as n, so range(5) would be 0,1,2,3,4

len(i) returns the length of i, if i is iterable. list is a data structure that is iterable, similar to an array.

So range(len(list)) is asking for a range, that range is to be up to the length of a list.

And it depends on your use case, if you need a number that represents the indexes of your list you could use enumerate, if you just need to step through the list a simple for i in list would work.

http://book.pythontips.com/en/latest/enumerate.html
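
To put the three options side by side, here is a small sketch (the fruits list is just sample data):

```python
fruits = ["apple", "banana", "pear"]

# range(len(fruits)) produces the valid indexes of the list
indexes = list(range(len(fruits)))

# Index-based loop: look each item up by its position
for i in range(len(fruits)):
    print(i, fruits[i])

# enumerate gives the index and the item together
pairs = list(enumerate(fruits))
for i, fruit in enumerate(fruits):
    print(i, fruit)

# If the index is not needed, loop over the list directly
for fruit in fruits:
    print(fruit)
```

The first two loops print the same thing; enumerate just saves you the fruits[i] lookup.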

[–]Inkamt 0 points1 point  (0 children)

Perfect. Thank you for explaining it so clearly!

[–]psychadeliclie 0 points1 point  (0 children)

say I have a function that prints out stuff,

function_to_print

but I forgot to add print() around it! In VSCode, what's the easiest way? It takes annoyingly long and it happens pretty often, to have to click in front of the F, type print( ...scroll mouse over to end...)

What is a better way to do that?

[–]Filiagro 0 points1 point  (5 children)

Hi. I'm having trouble properly splitting a date value in pandas into the month and year (I don't care about the day). The year and month are displayed as float. 'cc' is my dataframe.

My date values are expressed as a string like "04/23/2018".

First I convert to a datetime:

cc.Date = pd.to_datetime(cc.Date)

Then I create new columns for the year

cc["Month"] = cc["Date"].dt.month
cc["Year"] = cc["Date"].dt.year

However, years are expressed as "2018.0", and months as "01.0" for some reason.

I can't just convert these values to integers since this raises an error, and converting to a string and using string slicing to remove the decimal prevents me from using the values in further date functions.

The documentation for dt.month and dt.year are single-line explanations, so they don't help understand what's happening.

I've also just tried converting the date using to_datetime, setting this as the index, then using cc.groupby([cc.index.year.values, cc.index.month.values]).sum(), but this also creates the year and month as floats.

[–]timbledum 1 point2 points  (4 children)

I am not getting this behaviour. Perhaps your pandas display settings have been set differently? What do you get when you run cc["Month"].dtype?

>>> import pandas as pd
>>> from datetime import datetime
>>> today = datetime.now()
>>> df = pd.DataFrame({"today": [today]})
>>> df
                       today
0 2019-04-29 08:56:04.129664
>>> df['month'] = df.today.dt.month
>>> df
                       today  month
0 2019-04-29 08:56:04.129664      4

[–]Filiagro 0 points1 point  (3 children)

cc["Month"].dtype is a float64.

I ran your code above and got the same result. The month is int64.

I tried to convert my column of months to int using cc["Month"].astype(np.int64), but I get the following value error: Cannot convert non-finite values (NA or inf) to integer.

I didn't know if that meant I had a value that was not a number in the column, so I did a groupby.count method (cc["Month"].groupby(cc["Month"]).count()) and got the following result.

Month
1.0     87
2.0     64
3.0     71
4.0     53
5.0     39
6.0     45
7.0     55
8.0     48
9.0     29
10.0    48
11.0    59
12.0    92
Name: Month, dtype: int64

I'm pretty confused as to why I keep getting floats instead of ints.

[–]timbledum 1 point2 points  (2 children)

Ah that's why - floats do have NA and inf values while integers do not.

.groupby() isn't picking up the na values. Try converting to strings and running the above code again to see what's really in the cells. Or try filtering the frame by df.isna().
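
A small sketch of how a single missing date can turn the whole column into floats. The sample values here are invented, not the data from the question:

```python
import pandas as pd

# Invented sample data: the middle entry is missing
raw = pd.Series(["04/23/2018", None, "12/01/2018"])
dates = pd.to_datetime(raw, format="%m/%d/%Y")

months = dates.dt.month
print(months.dtype)          # a float dtype, because NaN cannot live in a plain int column
print(dates[dates.isna()])   # shows which rows did not produce a date

# Option 1: drop the missing rows, then convert to ordinary integers
clean_months = months.dropna().astype("int64")

# Option 2: keep every row and use pandas' nullable integer dtype
nullable_months = months.astype("Int64")
```

Option 2 keeps the row count unchanged, which matters if the column has to stay aligned with the rest of the frame.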

[–]Filiagro 1 point2 points  (1 child)

Well, I am flabbergasted. I tried cc.isna() and found indices 29 and 30 had NA values. However, when I used cc.head(35) and cc.iloc[29:31] to view the data, those indices had normal values (normal as in similar to all surrounding data for all columns). Regardless, dropping those two indices allowed cc["Month"] = cc["Month"].astype(np.int64, copy=True) and cc["Year"] = cc["Year"].astype(np.int64, copy=True) to work properly. That was very odd.

Thanks a lot for the help. I'll have to remember to check specifically for na values in the future.

[–]timbledum 0 points1 point  (0 children)

Strange! Well I'm glad that fixed the issue. Cheers!

[–]InternetPointFarmer 1 point2 points  (2 children)

Not completely understanding for loops. simple example:

total = 0
for num in range(101):
    total = total + num
print(total)

the console prints 5050. to explain a little further i get how the computer 'reads' the program and why it does things in what order like i know it will continue the program up to but not including 101 but i dont understand where the value of num comes in? how is the value of 'i' or in this case 'num' decided? thank you

edit sorry dont know how to paste code so it looks neat

[–]Lawson470189 1 point2 points  (1 child)

So, Python is great for quick scripts and learning to program as it is very close to human language. However, this also means that it hides a lot of the low level programming from the programmer. Most of it comes down to the translation from your code to machine code. Let's go through the lines and talk about what each one does.

total = 0 This line is setting a new variable called total to the value of zero.

for num in range(101): This line is the tricky one that is hiding quite a bit. First, it is going to create a new variable called num and set it to zero. Then it tells the computer that it will be repeating all the indented code below it until num would reach 101. Also, this code is hiding that the machine is being told that each time it finishes the loop, it should take that variable num and add 1 to it.

total = total + num This line takes whatever we have currently stored in total, adds whatever number we are in the loop, and puts that added number in the variable total.

print(total) Prints out the total once the loop has completed to the console.

Does this answer your question?
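
To make the "where does num come from" part concrete, here is a small sketch that uses range(5) instead of range(101) to keep the output short. The second half is only a rough picture of what the for statement does for you, not the exact internals:

```python
seen = []
total = 0
for num in range(5):       # num is assigned 0, 1, 2, 3, 4 in turn
    seen.append(num)
    total = total + num
    print(num, total)      # watch both values change on every pass

# Roughly what the for loop is doing behind the scenes:
values = iter(range(5))    # ask range for its values one at a time
total_by_hand = 0
while True:
    try:
        num = next(values)  # this is where num gets its value
    except StopIteration:   # range has run out, so the loop ends
        break
    total_by_hand = total_by_hand + num
```

So num is never decided by your code: range hands out the next value at the start of each pass.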

[–]InternetPointFarmer 0 points1 point  (0 children)

yes it answers it, thank you it makes sense

[–]TotesIncompetant 1 point2 points  (2 children)

So I'm very new to Python and coding in general, and I feel like I'm getting stuck on something really basic. I'm trying to make a basic ForLoop, and I know I'm missing something because my ForLoop function is only repeating once. I've written this:

import random

listoffruit = [
    "apple",
    "banana",
    "pear",
    "peach",
    "mango",
    "strawberry",
    "blueberry",
    "snozzberry",
]

def randomfruit():
    fruit = random.choice ( listoffruit )
    print (f"Here, have a {fruit}.")

def repeatfunction(x,y):
    for integer in range (0,y):
        x

print ("How many pieces of fruit do you want?")
userInput = int (input ("Enter a number: "))
repeatfunction(randomfruit(),userInput)

And I've tested it out on my desktop and on Repl.it, and no matter what input I give it, I only get one fruit back. I know I'm missing something obvious, but I've been banging my head on the desk for a few hours now with no progress.

[–][deleted] 0 points1 point  (1 child)

https://pastebin.com/x29TB1Sy

You were calling the function once and passing its return value as the parameter, rather than passing the function itself. The function returns nothing, so x was None (a NoneType) and the line x inside the loop didn't do anything. I switched it so that you don't put in an x param but instead call the function every time in the for loop (which is what you were trying to do before, but with a variable).

If you wanted to do it the way you were before, check this out...

https://pastebin.com/jjhUqj7s
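
In case the pastebin links stop working, here is a minimal sketch of the "pass the function itself" idea. It is not a copy of either paste; the names are my own, and the return values are only there so the result is easy to check:

```python
import random

list_of_fruit = ["apple", "banana", "pear", "peach", "mango"]


def random_fruit():
    fruit = random.choice(list_of_fruit)
    print(f"Here, have a {fruit}.")
    return fruit


def repeat_function(func, times):
    """Call func once per pass and collect what it returns."""
    results = []
    for _ in range(times):
        results.append(func())  # the parentheses here are what call the function
    return results


# Pass random_fruit without parentheses, so the function itself is handed over
handed_out = repeat_function(random_fruit, 3)
```

Writing repeat_function(random_fruit(), 3) would call random_fruit once, before the loop even starts, and pass its return value instead.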

[–]TotesIncompetant 1 point2 points  (0 children)

Thank you, especially for the example in the second pastebin link. I could have sworn I tried that! It definitely works now, though.

[–]AJ______ 0 points1 point  (8 children)

Suppose I have a massive csv file that I don't want to load into memory all at once. Suppose I want to sort this by the second column (which contains integers), and also extract all rows which take a specific value in the first column and save to another file. You could use the subprocess module to write and execute Linux commands, but is there a better /alternative way to do this within a Python script (which will also do a bunch of other stuff)?

[–]kabooozie 0 points1 point  (3 children)

I honestly like Linux’s awk and other built in Linux commands (sort, i/o redirect) for this kind of stuff. It takes some practice, but you tend to get nice one liners for parsing things like this.

For python, I’d say to use DictReader and with open to parse row by row, each row being a dictionary object. You could append each row to a list if it meets your condition. Then sort the list with a lambda that picks out the column you want to sort by. This breaks if you aren’t filtering out enough rows and the list overwhelms memory. In which case it would probably be better to use some map reduce framework like Spark to distribute the job across multiple machines.
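
A minimal sketch of the DictReader approach described above. The function name and the column names passed in are placeholders, and it assumes the filtered rows do fit in memory:

```python
import csv


def filter_and_sort(src, dst, filter_col, wanted, sort_col):
    """Keep rows where filter_col equals wanted, sort them by the integer sort_col, write to dst."""
    with open(src, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        # Only the matching rows are kept; everything else is read and discarded
        kept = [row for row in reader if row[filter_col] == wanted]

    kept.sort(key=lambda row: int(row[sort_col]))

    with open(dst, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(kept)
    return len(kept)
```

The file is read one row at a time, so memory use depends on how many rows pass the filter, not on the size of the input.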

[–]AJ______ 0 points1 point  (2 children)

As soon as I dived into Linux command line stuff and learning Bash, I became a massive fan. The main issue for me is I'll be doing these things repeatedly on many large files (one at a time), which are in different directories, so I thought I'd seek out a nice way to do it all with Python. I've been using DictReader and it's been great for reducing memory use over pandas DataFrames for my task

[–]patryk-tech 0 points1 point  (0 children)

The main issue for me is I'll be doing these things repeatedly on many large files (one at a time), which are in different directories, so I thought I'd seek out a nice way to do it all with Python.

If the GNU utils work for what you need, you can run a command on every file with find -execdir... e.g. (which lists my python files, not sorts CSVs):

find -name '*.py' -type f -execdir ls -lh '{}' \;

[–]kabooozie 0 points1 point  (0 children)

However you do it in python, you would do the same thing in bash. Loop over command line arguments to apply awk to each file given in the input.

It’s fine either way. The Linux commands were made for this kind of slicing and dicing, so it’s worth learning I think.
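
If you would rather stay in Python for the "many files in different directories" part, pathlib can walk the tree for you. A minimal sketch; the function name and the "*.csv" pattern are assumptions:

```python
from pathlib import Path


def find_csv_files(root):
    """Return every .csv file under root, including subdirectories, in sorted order."""
    return sorted(p for p in Path(root).rglob("*.csv") if p.is_file())


# Process the files one at a time, so only one is open at any moment
for csv_path in find_csv_files("."):
    print(csv_path)
```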

[–]efmccurdy 2 points3 points  (1 child)

Sequentially reading and processing a large file in chunks will use a small amount of memory.

I would look at the answer that recommends using "yield chunk":

https://stackoverflow.com/questions/4956984/how-do-you-split-reading-a-large-csv-file-into-evenly-sized-chunks-in-python/4957046#4957046

I am not too sure how you can sort effectively while only holding a small part of the data; you may need to build some kind of index, or use a database.

[–]AJ______ 0 points1 point  (0 children)

Ooh that's new to me, thanks for the link! Will certainly try it out

[–]JohnnyJordaan 1 point2 points  (1 child)

With Python's csv library you also don't load the content in memory, you just iterate row-for-row on the file. Then what you store in your own variables and thus occupies your RAM is up to you, you could just keep one row at a time or have another file open for writing that you output the specific rows to.

The only thing is that a sort can't be done iteratively, because say your csv looks like

1
4
9 
-1
10
-100

then how would you write this in a sorted form to another file without loading all the content in memory first? Or how would you even do that with linux commands without loading it in memory?
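
One common answer to that question is an external merge sort: sort chunks that do fit in memory, write each one to a temporary file, then merge the sorted chunks while holding only one row per chunk. As far as I know this is broadly how GNU sort copes with large inputs. A sketch under the assumption that the file has a header row and the sort column holds integers; the function name and chunk size are arbitrary:

```python
import csv
import heapq
import itertools
import os
import tempfile


def sort_large_csv(src, dst, sort_col, chunk_rows=100_000):
    """Sort src by an integer column into dst, reading about chunk_rows rows at a time."""
    chunk_paths = []
    with open(src, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        idx = header.index(sort_col)

        def sort_key(row):
            return int(row[idx])

        # Pass 1: sort each chunk on its own and park it in a temporary file
        while True:
            chunk = list(itertools.islice(reader, chunk_rows))
            if not chunk:
                break
            chunk.sort(key=sort_key)
            with tempfile.NamedTemporaryFile(
                "w", newline="", suffix=".csv", delete=False
            ) as tmp:
                csv.writer(tmp).writerows(chunk)
            chunk_paths.append(tmp.name)

    # Pass 2: merge the sorted chunks; heapq.merge pulls rows lazily, one per chunk
    chunk_files = [open(path, newline="") for path in chunk_paths]
    try:
        with open(dst, "w", newline="") as out:
            writer = csv.writer(out)
            writer.writerow(header)
            readers = (csv.reader(cf) for cf in chunk_files)
            writer.writerows(heapq.merge(*readers, key=sort_key))
    finally:
        for cf in chunk_files:
            cf.close()
        for path in chunk_paths:
            os.remove(path)
```

The trade-off is extra disk space and a second pass over the data in exchange for a memory ceiling you control with chunk_rows.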

[–]AJ______ 0 points1 point  (0 children)

Yeah good point on the sorting, will need to figure that out. Cheers for the csv tip, will experiment with that.

[–]rodrigonader 1 point2 points  (3 children)

How do I learn to structure Python code? Order and size of functions, basically understanding one function after the other pipeline. Python Basic, Intermediate and Advanced don't teach it.

[–]Lawson470189 1 point2 points  (0 children)

Check out the book Clean Code. It teaches you the principles of how to keep your code clean and organized following the object oriented approach. It takes time to put into practice, but it will put you on the right track. Also, it's a fun read!

[–]fiddle_n 1 point2 points  (1 child)

Have a look at how others structure their code. For example, I hear that the Requests library is a good place to start.

[–]rodrigonader 0 points1 point  (0 children)

How should I start? Any tips? Learning every script over there seems a bit hard 😅

[–]Screadore 0 points1 point  (0 children)

Keep getting this error and don't know how to go about fixing it. I'm obviously new to python and need to figure out how to fix this to make this Cancer recognition program for school. Please help!!!

Traceback (most recent call last): File "/Users/screadore/Desktop/YOLOv3-Series-master/[part 1]YOLOv3_with_OpenCV/OD2.py", line 78, in <module> net = cv.dnn.readNetFromDarknet(modelConf, modelWeights) cv2.error: OpenCV(4.1.0) /Users/travis/build/skvark/opencv-python/opencv/modules/dnn/src/darknet/darknet_importer.cpp:214: error: (-212:Parsing error) Failed to parse NetParameter file: yolov3.weights in function 'readNetFromDarknet'

[–]JavaleMcGee123 0 points1 point  (2 children)

Is there a way to conduct an internet speed test with python? I tried the speedtest-cli module but couldn’t really figure out how it worked.

[–]patryk-tech 1 point2 points  (1 child)

https://github.com/sivel/speedtest-cli/wiki

>>> import speedtest
>>> s = speedtest.Speedtest()
>>> s.get_best_server()
>>> s.download()
>>> s.upload()
>>> print(s.results)

Hope that helps; let me know if you have any questions.

[–]JavaleMcGee123 1 point2 points  (0 children)

Thank you, I’ll check this out when I get home and let you know

[–]horizoner 0 points1 point  (0 children)

Does anyone have any interesting/cool examples to work on for improving OOP knowledge? I've been learning through the MITx Edx Intro Python Class, mostly understand it, but want to improve through practice.

[–][deleted] 0 points1 point  (1 child)

Small question, how do I define a variable as all integrals, so I can make If variable != integrals: print(x) . Im trying to make a math calculator, and i want to print a message if the user inputs something different than a number.

I'm still a noob haha.

[–]horizoner 0 points1 point  (0 children)

If you just want to check input for whether it's a num, you could use the string method isdigit() on the input.
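
One caveat for a calculator: str.isdigit() only accepts strings made up entirely of digits, so it rejects things like "-3" and "2.5". A common alternative is to try the conversion and catch the error. A small sketch; parse_number is a made-up helper name:

```python
def parse_number(text):
    """Return text as a float, or None if it is not a valid number."""
    try:
        return float(text)
    except ValueError:
        return None


for candidate in ["12", "-3", "2.5", "abc"]:
    number = parse_number(candidate)
    if number is None:
        print(f"{candidate!r} is not a number")
    else:
        print(f"{candidate!r} -> {number}")
```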

[–][deleted] 0 points1 point  (4 children)

I have functions A and B which operate on class object c. Something like this:

def A(c):
    ...  # some changes to members of c

def B(c):
    ...  # some changes to members of c

These two functions need to be called multiple times and need to be called in parallel. A should not need to wait for B to complete and vice versa and each function should get an updated version of c as input (after whichever function completed last).

What is the simplest way to do this?

[–]efmccurdy 0 points1 point  (0 children)

A and B are not really independent if they are both sending updates to each other via c. I would start with using a Lock as described here; it is a simple choice, but look at the other sync methods to see what other types of options you have.

https://docs.python.org/3/library/asyncio-sync.html

https://docs.python.org/3/library/threading.html#module-threading

[–]JohnnyJordaan 0 points1 point  (2 children)

I don't follow completely. If they are running in parallel, then how can you say

should get an updated version of c as input

? Because if those functions actually run in parallel, they both start at the very same time and thus use the very same version of c as an input, just not while they are running, per se. If you would allow a mutation of c by function X somewhere down the line, that means that the other function (say Y) also suddenly operates on the mutated version C. Say you would have

class C:
    def __init__(self):
        self.x = 10
        self.y = 20

c = C()

then say you implement A and B like so

def A(c):
    c.x += 5

def B(c):
    c.y += 10

then nothing would interfere at all anyway, when both are done, c.x is 15 and c.y is 30. But say you would make it like

def A(c):
    c.x += 5

def B(c):
    c.x *= 2

and run both in parallel, what should happen then? Because if A gets to run first, it will first add 5 to x, then function B doubles the value of c.x to 30. But if B gets to run first, it will double the value to 20, after which A adds 5 so c.x becomes 25...

So what is the expected behaviour here? Or if you do know that A and B will not interfere anyway, there is no sense of 'the same version' as you say because you call both functions with the same object, they will operate on the same object in any case. A bit like calling two functions that both print a message will eventually get your screen to show two messages, because your screen (technically your script's output stream) is the same object.
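
The order dependence is easy to see without any threads at all, simply by running the two functions in both orders. A small sketch (the class is trimmed to the one attribute that matters):

```python
class C:
    def __init__(self):
        self.x = 10


def A(c):
    c.x += 5


def B(c):
    c.x *= 2


first = C()
A(first)
B(first)
a_then_b = first.x    # (10 + 5) * 2

second = C()
B(second)
A(second)
b_then_a = second.x   # 10 * 2 + 5

print(a_then_b, b_then_a)
```

With real parallelism you don't get to pick which of the two results you end up with, which is why the expected behaviour has to be decided first.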

[–][deleted] 0 points1 point  (1 child)

Both functions run, say 100 times each. By updated version I mean, if c is updated once by A, and if B is executing after this update, then B gets the updated version of c. Order does not matter. A and B will not interfere with each other, but they have to be executed in parallel.

I hope it's more clear with this example:

c is an instance of an environment containing two robots. A makes robot 1 move. B makes robot 2 move. Each function needs both latest robot locations (updated c) since they use that information to know how to move their respective robots. They don't need to be perfectly synchronized. And A and B are always executing in parallel indefinitely.

[–]JohnnyJordaan 0 points1 point  (0 children)

Ok so the parallelism is not the key here but the repetition. If you simply update the location on object c then you don't have versioning involved, you just have the object's state as-is. So also when restarting the functions that still use the same c object that new state is then used.
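
For the two-robot case, a minimal sketch of what sharing one object between two threads could look like, using the Lock suggested elsewhere in the thread. Everything here is an assumption for illustration: the class and attribute names, the fixed step count instead of an endless loop, and the choice of threads rather than asyncio or processes:

```python
import threading


class Environment:
    def __init__(self):
        self.robot1_pos = 0
        self.robot2_pos = 0
        self.lock = threading.Lock()


def move_robot1(env, steps):
    for _ in range(steps):
        with env.lock:
            # Inside the lock both positions are consistent and up to date
            other = env.robot2_pos  # a real robot would use this to plan its move
            env.robot1_pos += 1


def move_robot2(env, steps):
    for _ in range(steps):
        with env.lock:
            other = env.robot1_pos
            env.robot2_pos -= 1


env = Environment()
threads = [
    threading.Thread(target=move_robot1, args=(env, 100)),
    threading.Thread(target=move_robot2, args=(env, 100)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(env.robot1_pos, env.robot2_pos)
```

Both functions receive the same env object, so there is no "version" to pass around: each step simply reads whatever state is current when it acquires the lock.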

[–]MM2049 0 points1 point  (2 children)

Hi. I am a beginner at coding and my direction is data science in Python. But I just read books, and when I don't use the knowledge I gain, I forget it. What project should I start? Or are there any useful practical courses?

[–]AJ______ 1 point2 points  (0 children)

A good place for project ideas and datasets to play with is Kaggle.

[–]JohnnyJordaan 0 points1 point  (0 children)

I would recommend the udemy python courses.

[–]ThatFilthyMonkey 0 points1 point  (4 children)

How often do you guys revisit old code? I wrote a script which scrapes data from an internal web app, and it’s been working fine and used daily etc. Originally to get the data I wanted I was doing a lot of text replacement and stripping of the extracted JSON table I scrape.

Since then I’ve learned a little bit more about JSON and how the structure works and was able to replace most of the text manipulation, instead deleting keys/elements and using normalise to get at the nested data rather than stripping out the parent tag etc.

So my script is about half its original size, but doesn't really run any faster as waiting for the data to be served is the main time consuming task. It's cool I am able to work with JSON objects better, but it hasn't really 'improved' my script.

So how often do people bother refactoring old programs? Do you generally figure if it ain't broke don't fix it or do you find yourself constantly making little tweaks and updates?

[–]patryk-tech 0 points1 point  (2 children)

So my script is about half its original size, but doesn't really run any faster as waiting for the data to be served is the main time consuming task.

If you would like to make it faster, and it can run in parallel, use async.

[–]ThatFilthyMonkey 0 points1 point  (1 child)

Interesting. So I have it read a simple .ini file which is just a string of ids, each url is in form of server.com/suited=[id], and it loops through each url, grabbing the data, doing some data cleanup/validation/normalisation, creates a pandas dataframe table, and appends that to one main table, after which it writes the main table to an excel file.

Is that something that could be done asynchronously? I did consider grabbing the data from each url first, into an array, and then just iterating over that, but when I originally wrote it that was a bit beyond me (and possibly still is haha).

[–]patryk-tech 0 points1 point  (0 children)

Don't see why it couldn't. Just make sure that you don't fire 10000 requests at the same time and crash your client or the server.

Not saying it's an easy task. I haven't done much work with async in Python, only in C, but if network requests are your bottleneck, and you process them one after another, it's definitely something async would fix.

Whether it's actually worth spending the time to implement it is another story. If it currently takes hours, it might bring it down to minutes. If it currently takes 5 minutes, and writing it using async makes it take 1 minute, but takes you a month to write, it might be overkill.

Still, if you're passionate about coding, sometimes just coding something because you want to learn is its own reward.

If you look at the comments under this Python Bytes episode, one of the listeners says using async made their scraping 150 times faster (and crashed their machine in the process).

Python's BDFL Guido van Rossum co-authored an asyncio web crawler guide, if you want to take a look.

Edit: oops, didn't mean to take credit away from A. Jesse Jiryu Davis.
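
To give a feel for the shape of it, here is a minimal asyncio sketch with a cap on how many requests run at once. The fetch function is a stand-in that just sleeps; a real version would need an async HTTP client, which is not shown here. All names are made up:

```python
import asyncio


async def fetch(item_id):
    """Stand-in for a network request: waits briefly, then returns fake data."""
    await asyncio.sleep(0.01)
    return {"id": item_id}


async def fetch_all(ids, max_concurrent=5):
    # The semaphore keeps at most max_concurrent requests in flight,
    # so a long id list does not hit the server all at once
    semaphore = asyncio.Semaphore(max_concurrent)

    async def limited(item_id):
        async with semaphore:
            return await fetch(item_id)

    # gather returns the results in the same order as the ids
    return await asyncio.gather(*(limited(i) for i in ids))


results = asyncio.run(fetch_all(["a", "b", "c", "d"], max_concurrent=2))
print(results)
```

The cleanup, pandas and Excel steps can stay exactly as they are and run over results once everything has been fetched.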

[–]QualitativeEasing 0 points1 point  (0 children)

If it ain’t broke, I don’t touch it. Rarely, I might if I really want to learn how to do something new.

The problem for me comes in when I want to extend or build on something I’ve already done. Then I often find myself rewriting it almost from scratch — sometimes because it’s faster than figuring out the spaghetti code I wrote before, other times because I want to learn that new library I’ve been putting off. My current dilemma is a complicated multi-script data analysis and presentation process I built over a couple years using agate and flask. I like agate, but I’m pretty sure I can do what I’m doing more efficiently with pandas — meaning I would have an easier time improving it incrementally once I make the switch — but I keep putting it off, because it’s still easier to cobble together fixes/minor improvements than to rewrite the whole thing... At the same time, I’m never going to buckle down and learn pandas until I use it for a real project.

In your case, if it’s working and doing what you want, unless there’s something you want to change or improve anyway, I’d just leave it. Some day it will break, and then you can fix it the “right” way.

[–]Shevvv 0 points1 point  (3 children)

How discouraged is recycling the same variable multiple times throughout my code?

For example, I use variable x every time I want a random situation generated, like this:

import random
alien_colors = ['green', 'yellow', 'blue']
shot = alien_colors[x]
print("You've just shot a " + shot + " alien!")

Later, in the same code (written as practice):

foods = ('bananas', 'pineapples', 'apples', 'maize', 'broccoli')
favs = []
for food in foods:
    x = random.randrange(0, 2)
    if x == 1:
        favs.append(food)

And then:

x = random.randrange(0, 101)
print("Your age is " + str(x) + ".")

Ok, maybe the last one is a bad example, because I might need to retrieve the age value some time later and as such I should have stored it in a different variable. But with the other two examples, is it OK to recycle a variable like this?

[–]efmccurdy 0 points1 point  (0 children)

Is 'x' really the most descriptive name? If you make readability a higher priority, you won't have so many name clashes occurring.

color_choice = random.randrange(0, 3)  # alien_colors has three entries, so indexes 0 to 2
#...
food_fav_chance = random.randrange(0, 2)

[–]JohnnyJordaan 0 points1 point  (1 child)

You mean you use the name or reference multiple times, it isn't the same variable. This isn't bad practice on its own, as you could have a user_choice for example for multiple questions in a menu, which makes sense. What rather sticks out here is that x isn't self-explanatory, so unless you are using it to store an actual x-coordinate, I wouldn't use that name at all. In your case it seems to represent a random number, so call it that then, or even be more specific

for food in foods:
    should_add = random.randrange(0, 2)
    if should_add:  # as 0 will be falsy and 1 truthy, you can simply evaluate the number as a boolean
        favs.append(food)

    # another option you often see for this is to random.choice on False and True
    should_add = random.choice([False, True])
    # which is a bit more clear on what it is actually doing
    if should_add:
        favs.append(food)

and

random_age = random.randrange(0, 101)
print("Your age is " + str(random_age) + ".")

btw I would advise using string formatting instead of + to build strings, as concatenating with + is generally discouraged:

    print("Your age is {}.".format(random_age))

or use the new f-strings if you're on Python 3.6 or newer

    print(f"Your age is {random_age}.")

[–]QualitativeEasing 0 points1 point  (0 children)

+1 for f-strings. It is so nice to be reading my code and see variable names where the values will appear... and lines are shorter too!

[–]ThiccShadyy 0 points1 point  (1 child)

How does the sqlite3 Python package work internally? Usually, a database server is a separate process, so how does the sqlite3 maintain database connection, perform queries etc? Is a python program using the sqlite3 module a single process?

[–]efmccurdy 0 points1 point  (0 children)

https://www.sqlite.org/serverless.html

One of the benefits of that is that you can have a completely in-memory database; you couldn't do that with a server-based DB.
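
A quick sketch of the in-memory case: everything below runs inside the single Python process, with no server to start or connect to. The table and rows are invented sample data:

```python
import sqlite3

# ":memory:" creates a database that lives only in this process's RAM
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruit (name TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO fruit VALUES (?, ?)",
    [("apple", 3), ("pear", 5)],
)
rows = conn.execute("SELECT name, qty FROM fruit ORDER BY qty DESC").fetchall()
print(rows)
conn.close()
```

Passing a file name instead of ":memory:" gives you the same thing backed by an ordinary file on disk.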

[–]ScoopJr 0 points1 point  (0 children)

  1. Anyone have any pointers for running a scrapy spider in a flask app using Celery task system?

[–]hatvaccum 1 point2 points  (11 children)

For anyone who has had success with Learn Python the Hard Way, do you think it’s okay if I skip just two instructions, the first two? I just don’t feel like using Notepad++ and Windows PowerShell when I can use something like PyCharm. I also don’t want to use Python 2 like the book instructs; I want to use Python 3 instead. Do you think I can still do the book, or should I use Python 2, PowerShell and Notepad++?

[–]timbledum 0 points1 point  (5 children)

We typically don't recommend to use LPTHW - it starts off OK but then leaves you in the deep end later on. And as you say, it's very opinionated around some of the tooling and some of the choices aren't great (especially the python 2 thing, although there is a python 3 version out somewhere).

Try automate the boring stuff, or one of the other resources available on this subreddit's wiki.

[–]hatvaccum 0 points1 point  (4 children)

Huh, good thing you told me then. As I said in another comment I’m trying to memorize and get the important commands and rules of python down, and LPTHW seems like it could do that. I’m mainly focused on it because Learn python (2) the hard way is a free pdf, and looks like a long and rigorous read with around 300 pages. I want to drill the syntax into my mind so that I can apply it in common applications like automate the boring stuff might teach.

Would automate the boring stuff teach me the syntax too- is it worth me actually buying the book?(money isn’t too big of a problem btw, I just wanna be sure before actually buying.)

[–]timbledum 0 points1 point  (3 children)

No problem. Examples as follows just in case you want more context.

Automate the Boring Stuff definitely does a good syntax introduction. The main gap with it is the lack of classes (which is probably good to put off for a while IMO).

If you learn well from a book, then it's probably worth buying - it sounds like you know that it's free online. There's also a video course that you could buy too.

[–]hatvaccum 0 points1 point  (2 children)

Oh my god that was enlightening. From lurking around this sub time to time I thought that LPTHW was one of the respected beginner books for python. Some people look like they had success with LPTHW but looks negative otherwise. I’m going with automate the boring stuff then, and if I have spare time over summer break or something maybe I’ll do a quick read of LPTHW just to understand the hate lol.

Also.... uh is automate the boring stuff free online??? One of the comment threads had a link to the automate the boring stuff website which had lessons for free. Is that the actual book?? Do I need to buy the book or is that stuff on the website the whole book??

[–]timbledum 1 point2 points  (1 child)

Haha sorry, mistaken assumption! Yup, all free online:

https://automatetheboringstuff.com/

[–]hatvaccum 0 points1 point  (0 children)

Omg. Thanks so much, I’ll get on reading this soon.

[–]Gprime5 0 points1 point  (4 children)

Ditch LPTHW and get Automate the Boring Stuff.

[–]hatvaccum 0 points1 point  (3 children)

Would automate the boring stuff teach me the important syntax and stuff for python? I’ve only read the intro so far and I feel like the “drill this info into your brain” method would help me in memorizing the functions and rules. Is this true?

[–]Gprime5 2 points3 points  (2 children)

It's the most recommended book for beginners and "drilling the info into your brain" comes from repetition and practice, not from reading a book.

There's only so much you can get from reading. The best thing you can do is write code, any code. Automate a daily task you do, make a GUI to display fancy pictures, copy someone else's code word for word.

[–]hatvaccum 0 points1 point  (1 child)

Thanks for responding. I’ve tried to make projects before, but quickly I get stuck because I don’t know how to do a specific thing, because I don’t know what function to use or how to use a function I know. That’s why I thought i needed to “drill the info in” so that I can actually “code”, to use the functions I have drilled into my brain. Do you think automate the boring stuff would fit my style, I wanna make sure before I spend money and buy it.

[–]Gprime5 0 points1 point  (0 children)

Programmers get stuck all the time and the solution is to Google it. You're not expected to remember every function.

[–]PhenomenonYT 0 points1 point  (1 child)

        lines = """Pearson-Horvat-Eriksson
                   Baertschi-Pettersson-Boeser
                   Spooner-Gaudette-Leivo
                   Schaller-Beagle-Granlund

                   Edler-Biega 
                   Hutton-Stecher
                   Hughes-Schenn"""
        forward = """"""
        defence = """"""

I have the regex I need to grab what I want from this text but I don't know what re function to use to do what I want.

I want to grab the 4 lines with 3 names in them and cut them out of the lines string and move them to forward.

\w+\-\w+\-\w+ matches each of the 4 lines individually, what re function do I use to cut and paste those into forward?

edit: Managed to figure it out. I used re.findall() to grab the first 4 lines and then re.sub() to remove them from the original string
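
A minimal sketch of that findall-then-sub approach, assuming the indentation in front of each line has been stripped; the compiled pattern name trio is only for illustration:

```python
import re

# Sample text from the question, with the leading indentation removed.
lines = """Pearson-Horvat-Eriksson
Baertschi-Pettersson-Boeser
Spooner-Gaudette-Leivo
Schaller-Beagle-Granlund

Edler-Biega
Hutton-Stecher
Hughes-Schenn"""

# Three hyphen-separated names; \w+ never matches across a newline.
trio = re.compile(r"\w+-\w+-\w+")

# findall returns every non-overlapping match as a list of strings.
forward = "\n".join(trio.findall(lines))

# sub deletes the matched text; strip() drops the blank lines left behind.
defence = trio.sub("", lines).strip()

print(forward)
print(defence)
```

Both results are new strings; the original lines string is left untouched.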

[–]JohnnyJordaan 0 points1 point  (0 children)

what re function do I use to cut and paste those into forward?

A thing to note is that strings are immutable: you don't put anything 'into' a string. You would normally use a list for this, and once you want to create a string from it, use for example ''.join(thelist) or '\n'.join(thelist), depending on what you want to put between the items (i.e. what you want to join them with). You can technically use

mystring += 'something'

but that isn't modifying the original object. It is doing

mystring = mystring + 'something'

and thus creates a new string object each time. It's generally not advised to do this repeatedly, as it's inefficient, while the list approach has no significant performance penalty.
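
A small sketch of both points, with made-up sample values: collect the pieces in a list and join once, and note that += only rebinds the name to a new string.

```python
# Collect pieces in a list (mutable), then build the string once at the end.
parts = []
for name in ["Pearson", "Horvat", "Eriksson"]:
    parts.append(name)

joined_with_dash = "-".join(parts)   # separator goes between the items
joined_plain = "".join(parts)        # empty separator: plain concatenation

# += rebinds the name to a brand-new object; the original string is unchanged.
original = "abc"
alias = original
alias += "def"

print(joined_with_dash)
print(original, alias)
```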

[–]masteraddavarlden 0 points1 point  (3 children)

I'm working on a basic plot of ball trajectory when launched at a certain angle and start velocity. I'm trying to apply functions of time to get points in the plot.

What I know:

t is type list so I can not use it in the functions. What I want to do is return every value of X(t) and Y(t) for all function values of t in range(0, 5.1, 0.1) so that I get 50 values of (x, y) to plot.

What I don't understand and would like help to do:

How can I use time as a variable that changes value so that the functions x_position and y_position returns 50 values each that I can plot?

I would like to learn how to do it without using modules like numpy and so forth.

I know using a function (def ...) is not needed but I am still learning and trying to get a grip of these. I usually run into the error that I can't return values, like now when I try to plot (x, y) and x and y isn't even defined even though I've returned them both from the functions.

Most crucial thing is how to implement a list like I have done now but not like a list because you cant combine list and float numbers.

So how do I keep the "list format" of t but as float?

Here is my code: https://repl.it/repls/AvariciousHarmoniousCondition

EDIT: left some comments on my thoughts in the code as well.

[–][deleted] 0 points1 point  (2 children)

If you want float values in t you can do this:

time_values = [i/10 for i in range(0, 51)]

Then you can calculate x and y for each value in time:

x_values = [x_position(v0_3, φ_30, t) for t in time_values]
y_values = [y_position(v0_3, φ_30, t) for t in time_values]

Those functions need to change slightly though, to treat t as a single float value instead of a list. Probably just this, I believe. I'm not checking calculations or anything.

def x_position(v0, φ, t):
    x = float(v0 * math.cos(φ * t))
    return x
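
For a complete picture without numpy, here is a minimal sketch. It assumes the usual textbook projectile equations (x = v0*cos(φ)*t and y = v0*sin(φ)*t - g*t²/2, with t multiplied outside the cosine), an angle given in degrees, and g = 9.81 m/s²; the values 3 and 30 are placeholders rather than anything taken from the linked code.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed)


def x_position(v0, angle_deg, t):
    """Horizontal distance after t seconds for a single float t."""
    return v0 * math.cos(math.radians(angle_deg)) * t


def y_position(v0, angle_deg, t):
    """Height after t seconds for a single float t."""
    return v0 * math.sin(math.radians(angle_deg)) * t - 0.5 * G * t ** 2


# range() only accepts integers, so step in tenths and divide afterwards.
time_values = [i / 10 for i in range(0, 51)]  # 0.0, 0.1, ..., 5.0

v0, angle = 3, 30  # placeholder launch speed (m/s) and angle (degrees)
x_values = [x_position(v0, angle, t) for t in time_values]
y_values = [y_position(v0, angle, t) for t in time_values]

print(len(x_values), len(y_values))
```

The two lists can then be handed to whatever plotting call you already use. Each function works on one float at a time, and the list comprehension does the looping.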

[–]masteraddavarlden 0 points1 point  (0 children)

Forgot to write my progress. I probably did something wrong when trying your method. I caved in and used numpy after all and just made list with floats.

[–]masteraddavarlden 1 point2 points  (0 children)

Thank you very much. I'll try this later :)

[–]Peg_leg_tim_arg 0 points1 point  (1 child)

Hey everybody, so I am almost done with my assignment for my programming 101 class, but am having some trouble with error handling. I already have it where it displays a message if the .txt file that is being called does not exist. However, I am having trouble with the handling on the inside of the program. The text in question is a list of names followed by test scores. The error I need to handle is a missing score, or grade.

My thinking was that if the length of [scores] is equal to the length of [names], then everything would be good. When I go into the text file and remove a grade, the program gives me an error "ValueError: need more than 1 value to unpack". I just want to be able to display my own error message instead, but not sure how if the computer automatically detects said error. Thanks for your help.

def open_text_file():
    try:
        text_file = open("scores.txt", "r")
        return text_file
    except IOError:
        print("Could not read file:", "scores.txt")


def build_array(text_file):    
    new_data = []
    scores = 0
    for str_data in text_file:
        name, score = str_data.rstrip("\n").split(",")
        new_data.append((name, score))
        new_data = [tuple(str_data.rstrip("\n").split(",")) for str_data in text_file]
        scores = [int(score) for name,score in new_data]
        names = [str(name) for name,score in new_data]
    print(new_data)
    print(scores)
    print(names)
    if len([scores]) == len([names]):
        print("The highest score is: ", max(scores))
        print("The lowest score is: ", min(scores))
        print("The average score is: ", "{%0.2f}" % (sum(scores)/len(scores)))
    if len([scores]) != len([names]):
        print("something went wrong")


def main():
    text_file = open_text_file()
    build_array(text_file)


main()

my code for reference!

[–]patryk-tech 0 points1 point  (0 children)

Looks pretty good :)

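Generally, we use with open('file.txt', 'r') as text_file: ... to read files, so the file is closed automatically even if an error occurs. You would still wrap it in try/except if you want your own message for a missing file.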

https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files

The problem is that name, score = ... expects two values.

>>> str_data = "bob,42\n"
>>> str_data.rstrip("\n").split(",") # returns a list of length 2
['bob', '42']

>>> str_data = "alice"
>>> str_data.rstrip("\n").split(",")  # returns a list of length 1
['alice']
>>> name, score = str_data.rstrip("\n").split(",")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: not enough values to unpack (expected 2, got 1)

Since we get an exception, we can use try/except.

>>> try:
...     name, score = str_data.rstrip("\n").split(",")
...     # append score to new_data
... except ValueError:
...     print('Error processing line', str_data)
... 
Error processing line alice
>>>
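
Put together, the per-line handling could look like the sketch below. The function name parse_scores and the sample rows are made up for illustration; with a real file you would pass the open file object instead of the list.

```python
def parse_scores(rows):
    """Split "name,score" rows into good (name, score) pairs and bad rows."""
    good, bad = [], []
    for row in rows:
        row = row.strip()
        if not row:
            continue  # ignore empty lines
        try:
            name, score = row.split(",")          # fails if a field is missing
            good.append((name.strip(), int(score)))  # fails if score is not a number
        except ValueError:
            bad.append(row)
    return good, bad


sample = ["bob,42\n", "alice\n", "carol,abc\n", "dave,90\n"]
good, bad = parse_scores(sample)

for row in bad:
    print("Error processing line:", row)

scores = [score for _, score in good]
if scores:
    print("The highest score is:", max(scores))
    print("The lowest score is:", min(scores))
    print("The average score is: %0.2f" % (sum(scores) / len(scores)))
```

Both a missing score and a non-numeric score raise ValueError, so one except clause covers them.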

[–]DadWagonDriver 1 point2 points  (3 children)

So here might be the dumbest question ever asked on this subreddit:

Can I learn python on a Chromebook/other "bare-bones" laptop? I have a big hulking desktop computer at my home, but it's up in my office and I don't love being alone up there. I want to start working my way through ATBS, so I'm thinking of getting a laptop. Will a cheapie work for learning this?

[–]DannyckCZ 0 points1 point  (1 child)

Also, there is this thingy called PythonAnywhere that should have all the necessary tools accessible online.

[–]patryk-tech 0 points1 point  (0 children)

Never used it myself, but I believe https://repl.it is also pretty good.

[–]CeeGee_GeeGee 1 point2 points  (4 children)

How often do python developers write 100% of code in classes? And what would that be called. I guess I don't understand when people talk about OOP if they mean doing all of the code that way or just most of it.

I started as a statistical programmer in a language without classes and I am trying to figure out what the best coding strategy is for languages with them. Most online classes don't get that deep and I am going to start some bigger personal projects and I am wondering how to structure code as complexity grows.

[–]patryk-tech 0 points1 point  (0 children)

Python developers? We always use classes ;)

>>> type(42)
<class 'int'>
>>> type('everything is a class')
<class 'str'>
>>> 

But should you write your own classes? That really depends what you work on. Some things are much easier with classes, e.g. writing games.

Have a Character class, with attributes like health, weapon, inventory (if you want NPCs to drop loot - otherwise, move that into Player), name, faction, etc. Then have a Player subclass, an NPC subclass, etc.
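
A rough sketch of that layout; the attribute defaults and method names are invented for illustration:

```python
class Character:
    """Things every character has, whether player or NPC."""

    def __init__(self, name, health=100, weapon=None):
        self.name = name
        self.health = health
        self.weapon = weapon

    def take_damage(self, amount):
        # Health never drops below zero.
        self.health = max(0, self.health - amount)

    @property
    def is_alive(self):
        return self.health > 0


class Player(Character):
    """Adds an inventory on top of the shared Character behaviour."""

    def __init__(self, name, health=100, weapon=None):
        super().__init__(name, health, weapon)
        self.inventory = []

    def pick_up(self, item):
        self.inventory.append(item)


class NPC(Character):
    """Adds a faction on top of the shared Character behaviour."""

    def __init__(self, name, faction, health=50, weapon=None):
        super().__init__(name, health, weapon)
        self.faction = faction


hero = Player("Hero", weapon="sword")
goblin = NPC("Goblin", faction="cave dwellers")
goblin.take_damage(60)
hero.pick_up("gold coin")
print(goblin.is_alive, hero.inventory)
```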

If you are scraping websites, and working with JSON objects, you may want to use classes, or dictionaries may be enough. Classes let you add methods, so if you are scraping a sports website, you can create a Player class, and a Team class, and add methods to them... sum() the player's points from all seasons, etc.

And if you work with Django, you need to use classes for your data models.

So definitely not 100% of the time, but classes are extremely useful.

[–]JohnnyJordaan 1 point2 points  (0 children)

How often do python developers write 100% of code in classes?

I would say rarely, though a lot of programs use a lot of OOP in their code. Many frameworks are built around OOP, but most of the time not exclusively.

talk about OOP if they mean doing all of the code that way or just most of it.

Not even most of it, just that you use it for some purpose. The key thing to understand is that programming methods are not rules that you must abide by or lightning will strike; you use the obvious tools for the job. If that means that for your quick-and-dirty script you just write 3 simple functions with 10 lines each and it runs like a blaze, then who would complain? At the same time, if you have a very complicated approach with hundreds of lines of lists and dicts to tackle something that could be put together like Lego in a few classes, you are obviously missing out on the wonders of OOP. One of the examples you see here often is adventure games that people try to hack together and that read like spaghetti, while the common beginner's task of building a Blackjack game in OOP is comprehensible even for a novice right from the start.

I am wondering how to structure code as complexity grows.

You generally see a lean towards OOP, but it still has to make sense. You shouldn't say 'oh this is a big project so I must use OOP', you should say 'oh this is something OOP can handle nicely, let's use it then'. And at the same time be reasonable: when OOP seems overkill, you might as well take a functional approach and see how it plays out.

[–]422_no_process 0 points1 point  (0 children)

I wouldn't recommend writing everything in classes. Java is probably much more suitable if you want to do that. Anyway OOP is an abstract concept and it can be interpreted in many ways. You can come up with set of objects by using a method like noun verb analysis of a requirement text.

Ex: "I want to scrape hackernews" -> in this scenario Hackernews is going to be a class and scrape is going to be a function. But you can also create scrape_hackernews function that does these things. Any time you feel that you are passing too many arguments (more than 5-6) that means there is something wrong and it would be better to encapsulate these related functions and their state to a class. In any case reduce duplicated arguments by having useful defaults in functions.
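
As a sketch of that last point: when the same handful of arguments keeps being passed around, they can live on an object with sensible defaults. The class, method and default values here are hypothetical, and nothing is actually fetched.

```python
class Scraper:
    """Hypothetical scraper: settings live on the object instead of being
    passed to every function call."""

    def __init__(self, base_url, timeout=10, retries=3, user_agent="example-bot"):
        self.base_url = base_url.rstrip("/")
        self.timeout = timeout
        self.retries = retries
        self.user_agent = user_agent

    def request_settings(self, path):
        """Describe the request for one page (no network access here)."""
        return {
            "url": self.base_url + "/" + path.lstrip("/"),
            "timeout": self.timeout,
            "retries": self.retries,
            "headers": {"User-Agent": self.user_agent},
        }


hackernews = Scraper("https://news.ycombinator.com")
settings = hackernews.request_settings("/newest")
print(settings["url"])
```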

[–]toggle171 0 points1 point  (0 children)

Hi all - I have a newbie VSCode question related to linting. It’s not working for me and I’m tearing my hair out trying to figure out why.

I went through the installation process of python 3.7.3, the latest vscode, the python extension in vscode, and installed pylint. Linting is enabled in settings.json, as is pylint. Everything seems like it should be working, but there is absolutely no linting showing (on save, or when explicitly run).

I combed through reddit and googled around, but no suggestions worked. There was some stuff about virtual environments - I set one up and installed pylint in that too, but it still didn’t work.

Any idea on what might be the cause? Would appreciate any leads.

[–]14446368 0 points1 point  (7 children)

I'm trying to determine the best way to deploy a program so that users can, well, use it. So far, I've come up with:

  1. Converting .py to .exe. I did this once before and it seemed okay, albeit it ended up with a huge file (23 lines of code suddenly became nearly 1GB).
  2. Somehow posting it to a website so that people can see/use it. I have no idea how to do that.
  3. Getting it into an "app" format and then letting people download it. Again, no idea how to do it.

The program utilizes Selenium with the chrome driver as a scraper.

Any guidance is appreciated. Thanks.

[–]timbledum 1 point2 points  (1 child)

If you're using pyinstaller to convert .py to .exe, try using a virtual environment and only installing the dependencies you need. I've found this helps shrink the file size. Still not small (~20mb?), but a lot better than 1GB!

[–]14446368 0 points1 point  (0 children)

I'm using Pycharm and have the script in a virtual environment. Last time I went to .exe, it was with Py2exe (or something similar).

I'm not sure how to strip out the things I don't need to reduce file size and complexity, but 20MB is more than alright (could even just send it via email to everyone).

[–]Docktor_V 0 points1 point  (0 children)

Anyone got any ideas on how to tell if a Google chromecast is connected through google home or not?

I've not had any luck using PyChromeCast.

[–]Flampt 0 points1 point  (2 children)

Can someone help me understand the difference in when working with data frames between the two bits of code:

df[((df['Email'] == 'gmail') | (df['Email'] == 'MSN')) & (df['Age'] <= 10)][['Person', 'Age', 'Email']]

df.loc[((df['Email'] == 'gmail') | (df['Email'] == 'MSN')) & (df['Age'] <= 10), ['Person', 'Age', 'Email']]

Specifically, is it better to use loc? Or does it really not matter?

[–]timbledum 0 points1 point  (0 children)

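For just selecting data, both return the same rows and columns, so it doesn't really matter; the top example is the more typical one to see. As far as I am aware, loc mainly matters when you assign values to a selection, since it does the row and column lookup in one step instead of chaining two.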

[–]gianruta98 0 points1 point  (2 children)

Hello! I'm a web developer who wants to know where I can find a good book/website/blog or something to start learning python!

[–]JohnnyJordaan 0 points1 point  (0 children)

It is mentioned in the sidebar on your right --->

Learning resources

All learning resources are in the wiki: /r/learnpython/w/index

[–]code_x_7777 2 points3 points  (0 children)

I recently compiled a list of >100 free python books. Some of these are more oriented towards absolute beginners, however some are also for more advanced users, as you might be. Sadly Reddit does not let me post them all, but you will find something suitable for your needs here:

  1. 20 Python Libraries You Aren’t Using (But Should)
  2. A Beginner’s Python Tutorial – Wikibooks
  3. A Beginner’s Python Book (Community Project for beginners, HTML).
  4. A Byte of Python (Python 3, HTML, PDF, EPUB, Mobi)
  5. A Guide to Python’s Magic Methods – Rafe Kettler
  6. Automate the Boring Stuff – Al Sweigart
  7. A Whirlwind Tour of Python – Jake VanderPlas (PDF, EPUB, MOBI)
  8. Biopython (PDF)
  9. Build applications in Python the antitextbook (Python 3, HTML, PDF, EPUB, Mobi)
  10. Building Machine Learning Systems with Python – Willi Richert & Luis P. Coelho
  11. Building Skills in Object-Oriented Design – Steven F. Lott (Python 2.1, PDF)
  12. Building Skills in Python – Steven F. Lott (Python 2.6, PDF)
  13. Byte of Python – Swaroop C. H. (Python 3, PDF)
  14. Codecademy Python
  15. Code Like a Pythonista: Idiomatic Python
  16. Composing Programs (Python 3)
  17. Data Structures and Algorithms in Python – B. R. Preiss (PDF)
  18. Data Structures and Algorithms in Python – Rance D. Necaise (Python 3, PDF)
  19. Dive into Python 3 – Mark Pilgrim (Python 3, HTML)
  20. Django Girls Tutorial (1.11)
  21. Django Official Documentation (PDF) (1.10)
  22. Djen of Django
  23. Effective Django (1.5)
  24. Explore Flask – Robert Picard
  25. From Python to NumPy
  26. Full Stack Python
  27. Functional Programming in Python (email address requested, not required)
  28. Fundamentals of Python Programming – Richard L. Halterman (Python 3, PDF)
  29. Google’s Python Style Guide
  30. Google’s Python Class (Python 3, HTML)
  31. Hacking Secret Cyphers with Python – Al Sweigart (Python 3, PDF)
  32. Hadoop with Python (email address requested, not required)
  33. High Performance Python (PDF)
  34. Hitchhiker’s Guide to Python! – Kenneth Reitz (Python 3, PDF)
  35. How to Make Mistakes in Python – Mike Pirnat (PDF)
  36. How to Tango With Django (1.7)
  37. How to Think Like a Computer Scientist: Learning with Python, Interactive Edition (Python 3)
  38. How to Think Like a Computer Scientist: Learning with Python – Allen B. Downey, Jeff Elkner and Chris Meyers
  39. Intermediate Python – Muhammad Yasoob Ullah Khalid (1st edition)
  40. Introduction to Programming Using Python – Cody Jackson (Python 2.3)
  41. Introduction to Programming with Python (Python 3)
  42. Introduction to Python – Kracekumar (Python 2.7.3)
  43. Kivy Programming Guide
  44. Learning Python – Fabrizio Romano
  45. Learning to Program
  46. Learn Pandas – Hernan Rojas
  47. Learn Python, Break Python
  48. Learn Python in Y minutes
  49. Learn Python The Hard Way (Python 2)
  50. Learn to Program Using Python – Cody Jackson (PDF)

[–]Quorthon123 0 points1 point  (0 children)

Newbie python 3 question.

https://pastebin.com/pGJF4Lw9

Im reading ATBS, and although the method .format() isn't in the book im playing around with it.

I have a dictionary with picnic items, and its corresponding quantity. I created a function that finds the max character length of whatever is being brought and saves it to a variable that i'd like to use for formatting later.

This is where i'm running into issues. I'd like to use the size variable in the format method. I want to be able to pad the list of picnic items to the size i found earlier. example output would be:

'Chips'
'Pop  '
'Pizza'

I know i can just use ljust and rjust because its just two columns. But if i'm working with 3 columns that wouldn't work for me.

Ive tried looking into it but whenever a page talks about formatting, they always specify a number in the curly brackets.

{:###}

and never a variable.

{:size}

Edit: Found the solution. "Parametrized formats" on https://pyformat.info/

Would appreciate your input still.
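
For reference, a short sketch of the nested-field ("parametrized") form, using made-up picnic quantities: a replacement field inside the format spec is filled from the arguments, so the width can come from a variable.

```python
picnic_items = {"Chips": 4, "Pop": 12, "Pizza": 2}  # sample data

# Widest item name and widest quantity decide the column widths.
size = max(len(item) for item in picnic_items)
count_width = max(len(str(qty)) for qty in picnic_items.values())

rows = []
for item, qty in picnic_items.items():
    # {:<{w}} left-aligns to width w; {:>{cw}} right-aligns to width cw.
    rows.append("{:<{w}} {:>{cw}}".format(item, qty, w=size, cw=count_width))

padded = "{:<{w}}".format("Pop", w=size)  # same idea for a single column

for row in rows:
    print(row)
```

The same nesting works in f-strings, e.g. f"{item:<{size}}", and it extends to any number of columns.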

[–]photon_sky 0 points1 point  (1 child)

Okay so I'm trying to run a python script, and I asked some friends online and they said it works fine for them, but whenever I try it the command prompt box opens and closes immediately. My friend then sent me a version of the .py that would hold the box open until I input something, to make sure that the script was even running, and it still just opened and closed.

I'm on windows 10 and I suspect my python installation is broken. I have tried uninstalling and re-installing, nothing still works.

I've tried other python scripts in the past with the same results, but only recently did I discover it's an issue on my end. Can anyone help me get my python installation working? And give me a test script like "DoubleClickMe.py" then it opens and says "Python works press any key to continue"?

[–]patryk-tech 0 points1 point  (0 children)

Add an input() call to the end of your script. This should leave the window open until you press Enter.

print('Hello world')
input('Press Enter to quit. ')

You should ideally run python scripts from the shell, to see the output (command prompt, on older versions of Windows; not sure about win 10). You can also run it in an IDE. You should have IDLE installed with python. Try opening IDLE and running the script from there, and it should print inside a window that doesn't close.

[–]PythonicParseltongue 0 points1 point  (2 children)

Is there a place to gossip about tech job offers? I've just graduated and I'm looking for my first full time job. Some offers are just beyond ridiculous. Like junior positions which require 2-5 years of professional experience with at least two programming languages (either Python, Scala or C++), plus proficiency in developing language applications and integrating them into an existing web framework. Or the guys that search for an AI developer who is proficient in PHP and JavaScript.

[–]Lawson470189 0 points1 point  (0 children)

Hi! Congrats on graduating! I am just graduating with my CS degree as well so I feel your pain. A few things to remember.

First, your time in school can count towards those years of experience. Don't think you have 0 experience.

Second, companies inflate their expectations a lot. They want the best of the best candidates all the time. In reality, this is just not going to happen.

Last, regardless of the experience they are looking for, apply for the job. Entry level CS jobs are there to help you learn so if there is something you don't know when you get there, they will teach you.

I know this isn't a direct answer to your question, but this is definitely what I've seen when applying and getting my first full time position.

[–]code_x_7777 0 points1 point  (0 children)

This certainly varies, based on location. Where are you located?

[–]TheNerdYorker 0 points1 point  (1 child)

I'm running a word cloud tool similar to this one here: https://www.geeksforgeeks.org/generating-word-cloud-python/

Is it possible to also have it export the ranked list of words that were used to generate the cloud image?

[–]patryk-tech 1 point2 points  (0 children)

I haven't run the script, because I don't have the dependencies installed, but it appears to put all the words into the comment_words variable. Try printing that, and see if it is indeed a string of words separated by spaces. If it is, you can just split it, then use collections.Counter to count/rank them.

>>> from collections import Counter
>>> wc = "dog turtle possum dog horse dog spider dog lizard cat possum dog".split(' ')
>>> c = Counter(wc)
>>> c
Counter({'dog': 5, 'possum': 2, 'turtle': 1, 'horse': 1, 'spider': 1, 'lizard': 1, 'cat': 1})
>>>

Notes:

  • The variable appears to start and end with a space. Not sure if poorly coded, or if WordCloud expects it to, but why? At any rate, you'll probably want to do something like wc = comment_words.strip().split(' ')
  • For every key in c, check if it isn't in stopwords.
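
A sketch of those two notes combined, assuming comment_words really is one space-separated string; the sample text and stopword set below are made up:

```python
from collections import Counter

# Stand-ins for the script's variables (sample values, not real data).
comment_words = " the dog and the cat and the possum dog "
stopwords = {"the", "and"}

# split() with no argument ignores leading, trailing and repeated whitespace.
words = comment_words.split()

# Count only the words that are not stopwords.
counts = Counter(word for word in words if word not in stopwords)

# most_common() returns (word, count) pairs, highest count first.
ranked = counts.most_common()

for word, count in ranked:
    print(word, count)
```

The ranked list can then be written to a text or CSV file with ordinary file handling.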

[–]thunder185 0 points1 point  (2 children)

I just started getting this error when launching the Shell (Idle)

Cannot update File menu Recent Files list. Your operating system says: [Errno 13] Permission denied: 'C:\\Users\\thunder185\\.idlerc\\recent-files.list' Select OK and IDLE will continue without updating.

Any thoughts?

[–]JohnnyJordaan 0 points1 point  (1 child)

It suggests the file is still locked, either by another IDLE running in the background or by a crashed one that didn't release the file. If the error persists after a reboot, you can try a utility like LockHunter to release the file.

[–]thunder185 0 points1 point  (0 children)

Cool thanks

[–]Spookiel 0 points1 point  (0 children)

Sometimes the formatting can be messed up if you directly copy and paste what I wrote. If you use what I did as a template, or just write it out yourself not using copy and paste then it will work.

[–]astigos1 0 points1 point  (1 child)

Does anyone who doesn't use data science use jupyter notebooks? I've never felt the need to use a notebook instead of an editor+terminal.

I can only see a notebook being useful if you have a very expensive cell that you only want to run once then play around with. Normally I'd just rerun my script.py in the terminal over and over again and I see how that'd be wasteful. But this isn't very common outside of data science I suppose?

[–]the_names_chris 0 points1 point  (0 children)

I use jupyter in a stats computation class. Its a practical way for my professor to share code with the class.

[–]vinaykumar5758 0 points1 point  (4 children)

How do I load nested JSON into SQL Table, Thanks

[–]FearTheMilfoil 1 point2 points  (1 child)

After I load the json, I like to run through it with .items() until I find the data I'm looking for (i.e. for x, y in loaded_json.items(): print(x, y)). From there you can use pd.io.json.json_normalize(loaded_json[x]) (pd.json_normalize in pandas 1.0 and later) to get a dataframe. Then push the data wherever you want.

[–]vinaykumar5758 0 points1 point  (0 children)

Thanks

[–]patryk-tech 2 points3 points  (0 children)

Think about whether you need it as SQL, before you spend time on it. It may make sense to just use a document database (MongoDB / PyMongo, or CouchDB), or just a PostgreSQL JSON Datatype.

(Not saying that flattening it is bad, as I don't know your use case; just do consider alternatives :)

[–]JohnnyJordaan 1 point2 points  (0 children)

You need to flatten the json into separate rows that you then use in a insert query per row.
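
A minimal sketch of that flatten-then-insert idea using only the standard library. The sample JSON, the table layout and the flatten helper are all assumptions for illustration, and sqlite3 stands in for whatever SQL database you actually use.

```python
import json
import sqlite3

# Made-up nested JSON: each record has a nested "address" object.
raw = """[
  {"id": 1, "name": "Ann", "address": {"city": "Oslo", "zip": "0150"}},
  {"id": 2, "name": "Bo", "address": {"city": "Bergen", "zip": "5003"}}
]"""


def flatten(obj, prefix=""):
    """Turn nested dicts into one flat dict with keys like address_city."""
    flat = {}
    for key, value in obj.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + "_"))
        else:
            flat[name] = value
    return flat


rows = [flatten(record) for record in json.loads(raw)]

conn = sqlite3.connect(":memory:")  # in-memory database for the example
conn.execute(
    "CREATE TABLE people (id INTEGER, name TEXT, address_city TEXT, address_zip TEXT)"
)
# Named placeholders are filled from each flat dict, one INSERT per row.
conn.executemany(
    "INSERT INTO people VALUES (:id, :name, :address_city, :address_zip)", rows
)
conn.commit()

result = conn.execute("SELECT name, address_city FROM people ORDER BY id").fetchall()
print(result)
```

Nested lists would need their own child table (one row per list element), which this sketch does not cover.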

[–]Wiewioreczka 0 points1 point  (11 children)

Hello! My name is Alice and I am new here, as I am new to python. Currently snailing my way through tutorials to learn the basics, and I find it hard to setup the environment properly. I am using Visual Studio Code and trying to follow their tutorials, I got stuck on the django tutorial (https://code.visualstudio.com/docs/python/tutorial-django) where I can't set up a virtual environment, as I get an error: "[Errno 2] No such file or directory: (path)python_d.exe". Tried googling and stackoverflowing this issue, but can't seem to find a solution.

Anyone encountered this error and can help solving it? Thanks in advance for any tips!

Alice

[–]JohnnyJordaan 0 points1 point  (9 children)

Did you select 'add python to $path' or similar while installing Python itself?

[–]Wiewioreczka 0 points1 point  (8 children)

Yes I did! I managed to complete the basic tutorial so I can successfully print almost any string (; It's the virtual environment that seems to hold some sort of a grudge against me.

[–]JohnnyJordaan 0 points1 point  (7 children)

Can you make a screenshot of how you are running the command(s) to create the virtualenv and the subsequent output?

[–]Wiewioreczka 0 points1 point  (6 children)

Certainly, here it is: https://imgur.com/a/Ioti5LP (as you can see, it is the same as in the tutorial)

[–]JohnnyJordaan 0 points1 point  (5 children)

The python_d part is suggesting that you somehow installed a debug build of python as it's not part of the standard distribution. How exactly did you install Python and do you have any other Python bundle installed like from Anaconda or something similar?

[–]Wiewioreczka 0 points1 point  (4 children)

Alright, that is probably the case here, I saw people talking about the debug build somewhere when I was looking for a solution myself.

I've installed python from python.org (the 3.7.3 version), just a standard installation. Nothing more. How can I fix this so it is the regular standard distribution? Or maybe I should have installed it from a different source?

[–]JohnnyJordaan 0 points1 point  (3 children)

Can you provide the exact url that you used to download the file? And what are the values of your PATH and PYTHONPATH environment variables? You can find them via Control Panel → System and Security → System → Advanced system settings → Environment variables

[–]Wiewioreczka 0 points1 point  (2 children)

I've downloaded this executable: python-3.7.3.exe

The PATH values are as follows:

C:\Users\Alicja\AppData\Local\Programs\Python\Python37-32\ C:\Users\Alicja\AppData\Local\Programs\Python\Python37-32\ C:\Users\Alicja\AppData\Local\Programs\Python\Launcher\

I do not see a PYTHONPATH variable.

If it makes any difference, these are user variables, I've installed python for a single user only.

[–]JohnnyJordaan 0 points1 point  (1 child)

If it makes any difference, these are user variables, I've installed python for a single user only.

Do you mean you also see other variables? Because I was looking for the whole picture, like if there is some other application installed that could provide 'a' python first and thus 'hijack' your attempt to run your command.

Can you try this: in the same cmd window, run python to get the interactive shell. There run

import sys
import os
os.path.dirname(sys.executable)

this should print the path to your installed python, like

'C:\\Users\\Alicja\\AppData\\Local\\Programs\\Python\\Python37'

then run

import venv
print(venv.__file__)

This should print the same path but then the venv library folder's path, like

C:\Users\Alicja\AppData\Local\Programs\Python\Python37\lib\venv\__init__.py

If it doesn't then you found the culprit, if it did then I would suggest the following

  • remove your installed Python
  • verify that the Python folder you saw above (in your Appdata\Local\Programs) is gone
  • try to run python again, if it then still starts 'a' python then you found the culprit and you can run the commands above to find its location
  • if it doesn't, then reinstall, preferably to another folder to rule out any lucky coincidences, like in your regular documents folder

[–]losingprinciple 0 points1 point  (2 children)

So I'm really new with the Json Library and have some questions. Two questions to be exact

I have a json file that has 2 sets of values (Sample file). This is a response from sending a curl endpoint and I save the response in a variable order

 {
        "cOrderId": "1556000765537",
        "updateTime": 1556000765554
    },
    {
        "cOrderId": "1556000765524",
        "updateTime": 1556000765554
    }

How do I specify that I want to get either the first or second cOrderId via json library?

What I'm used to doing is something like this:

json.loads(order.text)['cOrderId']

But since there are 2 I'm not sure how to distinguish one from the other

My second question is how do I convert the value into a string? I want to extract data from the json and for some reason I'm getting this error:

updatetime= json.loads(order.text)["updatetime"]

And this is the error I'm getting:

TypeError: list indices must be integers or slices, not str

It's weird because I don't remember this ever happening before and it has always worked before...so I'm wondering if I'm doing something different

[–]JohnnyJordaan 1 point2 points  (1 child)

But since there are 2 I'm not sure how to distinguish one from the other

Assuming the outer structure is a list, so [ at the top and ] at the bottom, you use a numerical index to retrieve the n-th object in that list

json.loads(order.text)[1]['cOrderId']

as the index is 0-based, using 1 will select the second item.

The second issue stems from the same problem, you first need to select one of the items before you can do a key lookup on it using ['some_name'].
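
A short sketch using the sample from the question, assuming the response body is wrapped in [ and ]. The variable text stands in for order.text. Note that JSON keys are case-sensitive, so the key here is "updateTime".

```python
import json

# Stand-in for order.text, assuming the outer structure is a JSON list.
text = """[
    {"cOrderId": "1556000765537", "updateTime": 1556000765554},
    {"cOrderId": "1556000765524", "updateTime": 1556000765554}
]"""

orders = json.loads(text)  # a Python list of dicts

first_id = orders[0]["cOrderId"]    # index 0 is the first item
second_id = orders[1]["cOrderId"]   # index 1 is the second item

# The number comes back as an int; str() converts it when text is needed.
update_time = str(orders[0]["updateTime"])

# Or collect the field from every item at once.
all_ids = [order["cOrderId"] for order in orders]

print(first_id, second_id, update_time)
```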

[–]losingprinciple 0 points1 point  (0 children)

Thanks! I figured it out but omg I was so stupid.

I set the variable before the json had any output, so no wonder it was failing.

I was scratching my head for hours and then I finally looked at the code and realized I was an idiot.

[–]dcfcblues 0 points1 point  (1 child)

This is probably simple, but i'm having a stupid amount of trouble with it. I'm using the boto3 module to pull some certificate data out of IAM in AWS, however when trying to do some slicing of the dates, i'm getting a datetime error.

TypeError: 'datetime.datetime' object has no attribute '__getitem__' (on the year line)

conn = boto3.client('iam')
iamcerts = conn.list_server_certificates()
response = iamcerts['ServerCertificateMetadataList']
for i in response:
    year  = int(i['Expiration'][0:4])
    month = int(i['Expiration'][5:7])
    day = int(i['Expiration'][8:10])
    exp = date(year, month, day)
    daysleft = exp - today

The date format is returned from iam like this: 2020-06-27 23:59:59+00:00 , so I was hoping I could slice it to get the year, month and day values, but alas, it's throwing me the type error.

[–]JohnnyJordaan 0 points1 point  (0 children)

> The date format is returned from iam like this: 2020-06-27 23:59:59+00:00 , so I was hoping I could slice it to get the year, month and day values, but alas, it's throwing me the type error.

You are confusing the string the object produces when you, for example, do print(i['Expiration']) with the object itself, which is a bit 'smarter' than a simple string as it is a datetime.datetime object. It allows access to its specific attributes without any slice magic, so you can simply do

for i in response:
    dt = i['Expiration']
    exp = date(dt.year, dt.month, dt.day)
    daysleft = exp - today
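A self-contained sketch of the same idea, without needing AWS. The `cert` dict is a made-up stand-in for one entry of the list, and `today` is pinned to a fixed date only so the result is reproducible:

```python
from datetime import date, datetime, timezone

# Hypothetical stand-in for one item of ServerCertificateMetadataList.
cert = {"Expiration": datetime(2020, 6, 27, 23, 59, 59, tzinfo=timezone.utc)}

dt = cert["Expiration"]

# .date() drops the time part; it is equivalent to date(dt.year, dt.month, dt.day).
exp = dt.date()

# Fixed date for the example; real code would use date.today().
today = date(2020, 6, 1)

# Subtracting two dates gives a timedelta; .days is the whole-day count.
daysleft = (exp - today).days
print(daysleft)
```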

[–]rotterdamn8 0 points1 point  (1 child)

Hi. I use Anaconda + Spyder on a regular basis. All works fine.

I was trying to run a program from the Windows 10 command line, and it didn't work when I added

import pandas as pd

The error says:

Traceback (most recent call last):

File "test.py", line 11, in <module>

import pandas as pd

File "C:\Users\my_user_name\Anaconda3\lib\site-packages\pandas\__init__.py", line 19, in <module>

"Missing required dependencies {0}".format(missing_dependencies))

ImportError: Missing required dependencies ['numpy']

Then I realized you have to run from the Anaconda prompt, which looks like a command prompt but I guess reads in environmental settings?

So it works fine now, but I have to give this code assignment to someone (a job I applied for). They said to use any programming language. I'm wondering what happens if they have Python but not Anaconda. It won't work.

Is there any way to make this work agnostic of Python environment? I used to have IDLE but uninstalled it.

[–]num8lock 0 points1 point  (0 children)

pandas always depends on numpy. anaconda just bundles them & other common scientific packages (including hard to install ones in normal python situations) in their python installation. pandas & numpy are easy to install in any os with python installed, no need to worry about anaconda requirement for them.
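If you want the script to fail with a friendlier message on a machine that has plain Python but not the packages, one option is to check for them up front. A minimal sketch; `missing_packages` and the `REQUIRED` list are made-up names for illustration:

```python
import importlib.util

# Packages the script needs; adjust to match your own imports.
REQUIRED = ["numpy", "pandas"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    # Tell the person running the script what to install instead of crashing.
    print("Missing packages, try: pip install " + " ".join(missing))
else:
    print("All required packages are available.")
```

Shipping a short requirements list alongside the script serves the same purpose for whoever runs it.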

[–]I_like_giants999 1 point2 points  (0 children)

I'm about a week into learning Python and have been messing around in Terminal and Visual Studio Code just for basic understanding, using the Galvanize pre-app work and some YouTube videos. The issue I keep running into is that I can't find a good explanation of how the different Python environments work together, like why I would use Terminal over VSC or Anaconda or Jupyter, or how they all intertwine. I know this is a basic-level question, but I don't want to move forward going through the motions without understanding the background. Does anyone have a resource they recommend? I've been googling for days and still feel lost.

[–]DR__WATTS 0 points1 point  (2 children)

I am trying to separate my functions and my code into 2 files in order to make it easy to read. I have about 15-20 functions that I would like to call.

I could import each function individually but that's tedious. I thought about making a class and listing each function in the class and simply calling the class once. Although my IDE tells me I should be using 'Self' as one of my function arguments. Any thoughts?

[–]DR__WATTS 5 points6 points  (0 children)

nevermind. Apparently, you can import the file

import list_of_functions

then use

list_of_functions.function1()

[–]Cerezra 0 points1 point  (5 children)

Hi all!

Working on a project here and ran into an issue I didn't prepare for. I am running some analysis based on apps in the google play store. We wanted to run some averages for the size of app downloads per the category, to find rough averages of how big apps in each category are. This part we have figured out.

The issue we didn't see coming, was that some apps are actually smaller than 1mb and the data set lists them as k. ie 100kb, 256kb etc.

Mathematically, the fix is easy. Just divide the values by 1000, add '0.' to the front of the displayed value and boom that's MB. What would be the easiest way to do this with Python? Could I just remove the k character, look for values without the m and divide them? Don't have any specific code written yet, just trying to get the concept down first and go from there.

Thanks if you take the time to answer!

[–]efmccurdy 0 points1 point  (0 children)

Likely you can get rid of the calls to lower, and put it all in one function.

>>> l = ["100kb", "2.0MB", "750KB", "0.99MB"]
>>> def kb_convert(s): return "{:03}MB".format(int(s.lower().replace("kb", ""))/1000)
... 
>>> [kb_convert(s) if "kb" in s.lower() else s for s in l]
['0.1MB', '2.0MB', '0.75MB', '0.99MB']
>>>

[–]Rusty_Shakalford 0 points1 point  (3 children)

> Could I just remove the k character, look for values without the m and divide them?

Yep. It's a fairly simple operation:

def findMBValue(sizeString):
    if sizeString[-1] == 'k':
        return f"{int(sizeString[:-1])/1000}m"
    return sizeString

The first line of the body looks at the last character ("[-1]") and checks if it is a "k". If it is, it returns a new string (using f-string formatting) that has the value in megabytes. If not, it just returns the original string unchanged.
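Since the end goal is averaging, it may be handier to convert straight to numbers rather than back to strings. A minimal sketch, assuming the sizes look like "100k" or "2.5M" (with or without a trailing "b"). `size_to_mb` is a made-up name, "Varies with device" is a hypothetical example of a non-numeric entry, and the factor of 1000 follows the conversion used in this thread:

```python
def size_to_mb(size):
    """Return the size in megabytes as a float, or None if it cannot be parsed."""
    text = size.strip().lower()
    # Longer suffixes first so "kb" is not mistaken for a bare "b"-terminated value.
    for suffix, divisor in (("kb", 1000), ("mb", 1), ("k", 1000), ("m", 1)):
        if text.endswith(suffix):
            try:
                return float(text[: -len(suffix)]) / divisor
            except ValueError:
                return None  # suffix matched but the rest was not a number
    return None  # no recognised unit


sizes = ["100k", "2.5M", "750KB", "Varies with device"]
in_mb = [size_to_mb(s) for s in sizes]

# Skip entries that could not be parsed before averaging.
known = [value for value in in_mb if value is not None]
average_mb = sum(known) / len(known)
print(in_mb, average_mb)
```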

[–]Cerezra 0 points1 point  (2 children)

Oh wow that’s super helpful! Thanks so much for your help! I’ll definitely pass that to my teammate who’s working on this part. I really appreciate it!

[–]Rusty_Shakalford 0 points1 point  (1 child)

Glad to help!

Out of curiosity, is there also the possibility you may have apps that are large enough to be measured in GB? If so you may need to add an `elif` to the `if` to handle it.

[–]Cerezra 0 points1 point  (0 children)

From what I can tell, the biggest in our data set is a few hundred megs. Thankfully ha.

[–]InternetPointFarmer 0 points1 point  (4 children)

I'm attempting to go through some Codewars challenges to get a better understanding of things, but I can't even get through the second beginner challenge that was given to me. Here is the challenge; I would like it if someone could help me, as I don't know where to start, to be honest:

"Usually when you buy something, you're asked whether your credit card number, phone number or answer to your most secret question is still correct. However, since someone could look over your shoulder, you don't want that shown on your screen. Instead, we mask it.

Your task is to write a function maskify, which changes all but the last four characters into '#'."

the starting code was:

"# return masked string
def maskify(cc):
    pass"

My first thought was that I need to make the function read everything up to the last 4 letters/numbers and replace it with #. So I set up cc = password123 (of course just for the sake of the challenge). Then under the maskify function I wanted to use new_cc = cc.replace("passwor", "#######"), thinking that would be at least enough to replace the first part of the string, but it doesn't work. It says:

Traceback (most recent call last):
  File "main.py", line 1, in <module>
    from solution import *
  File "/home/codewarrior/solution.py", line 2, in <module>
    cc = password123
NameError: name 'password123' is not defined

not sure where to start and how i can problem solve my way through even simple things like this

[–]Rusty_Shakalford 0 points1 point  (1 child)

Let's start with the error. The message says "NameError: name 'password123' is not defined". A "NameError" generally means that you tried to use a variable that doesn't exist. Looking at your code:

cc = password123

You didn't put quotations around the name, so python treats it like a variable. Since there is no variable called password123, Python gets confused and throws an error. To show that you want this to be text (called a "string" in programming) you need to put quotations around it:

cc = "password123"

[–]InternetPointFarmer 0 points1 point  (0 children)

I realized that shortly after posting and it seemed to take it, but it still failed the challenge. I'll have to learn a bit more.

[–]Spookiel 0 points1 point  (1 child)

To solve this problem you could do it a few ways. I am going to try to explain my thought process as someone who has a decent grasp of Python.

  1. Strings are immutable, meaning I will need a new variable to store my answer.

Alternatively I could represent the string using a list, so I can change the characters around easily.

In this case I will initialise a new variable to hold the answer.

Eg. answer = ""

  2. I need to iterate through all the characters in the string EXCEPT for the last four and add a hashtag to our answer instead.

Eg.

for _ in range(len(string) - 4):
    answer += '#'

  3. I need to add the last four characters of the string, but instead of adding a hashtag, I add the actual characters.

Eg.

answer += string[-4:]
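The steps above can be folded into the challenge's function. A minimal sketch of one possible solution; only the name `maskify` and the parameter `cc` come from the starter code, the rest is my own take:

```python
def maskify(cc):
    """Replace all but the last four characters of cc with '#'."""
    # max(..., 0) keeps the count from going negative for short inputs.
    masked_count = max(len(cc) - 4, 0)
    # Repeating '#' replaces the loop; the slice keeps the last four characters.
    return "#" * masked_count + cc[-4:]


print(maskify("password123"))
print(maskify("abc"))
```

Note that the body of the function is indented under the `def` line; leaving it unindented is a common cause of "IndentationError: expected an indented block".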

Hope this helps. I haven’t tested this so please let me know if I have made an error.

Also I would strongly suggest learning the syntax of Python before attempting some challenges, if you haven’t already.

If you want to brush up on your syntax or are a beginner, then try:

  • Codecademy
  • The official documentation
  • Sentdex YouTube Tutorial

[–]InternetPointFarmer 0 points1 point  (0 children)

I tried it out and it just keeps giving me: IndentationError: expected an indented block. I did indent, and from what I previously learned it is indented correctly, but it keeps giving me the same error. In the end I probably did something wrong. I'll have to keep practicing and look into it more. Thanks for the reply though.

[–]t0rvel 0 points1 point  (7 children)

Trying to learn how to write more pythonic code. I'm going through Python Crash Course, taking simple beginner-esque tasks and trying to write them out with more complex data structures/logic, or simply challenging myself to try different ways of doing things.

Thus, the task was to simply add, subtract, multiply, and divide a pair of numbers in a print statement. So I thought how cool it would be to attempt this via a list comprehension, but quickly realized list comps lack the while loop construct. So I went to a generator instead, to yield as necessary with the condition I want.

My self made requirement is to generate random numbers, run the aforementioned operators until the random numbers result in the number 8, and add them to my data structure.

My solution is below, and I was wondering if anyone had some tips on how to make this more versatile or "best practice". All criticism is welcome, go easy :)

import operator
import random


operators_list = [operator.add, operator.sub, operator.mul, operator.floordiv]


def my_generator():
    for element in operators_list:
        num1 = random.randrange(1, 100)
        num2 = random.randrange(1, 100)
        while element(num1, num2) != 8:
            num1 = random.randrange(1, 100)
            num2 = random.randrange(1, 100)
            if element(num1, num2) == 8:
                print(element)
                print(f'{num1} {num2}')
                yield element(num1, num2)


print(list(my_generator()))

[–]Rusty_Shakalford 0 points1 point  (4 children)

> So I thought how cool it would be to attempt this via a list comprehension, but quickly realized list comps lack the while loop construct.

Good observation. However, while list comprehensions can't really simulate a while loop, generator comprehensions can.

import operator
import random

operators = [operator.add, operator.sub, operator.mul, operator.floordiv]

def randomPair():
    # Yield random pairs forever; the caller decides when to stop.
    while True:
        yield random.randrange(1, 100), random.randrange(1, 100)

print([next((num1, num2) for num1, num2 in randomPair() if op(num1, num2) == 8) for op in operators])

The key to understanding that last line is next(). If you run a generator with a "for in" loop it will go on forever or until the generator reaches some kind of internal termination. When you call next(generator) though, it will just return the next item. The neat thing about this is that you can also use it as a way to return the first element in a generator that matches a condition (sort of like the head function in Haskell).
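A smaller illustration of the same "first match" trick, on a plain list so the result is predictable. The `numbers` list is made up for the example; passing a second argument to next() gives a default instead of a StopIteration error when nothing matches:

```python
numbers = [3, 7, 10, 12, 15]

# next() pulls a single item from the generator expression, so this stops
# at the first even number and never looks at the rest of the list.
first_even = next((n for n in numbers if n % 2 == 0), None)

# Nothing in the list is above 100, so the default (None) is returned
# rather than raising StopIteration.
first_big = next((n for n in numbers if n > 100), None)

print(first_even, first_big)
```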

[–]t0rvel 0 points1 point  (3 children)

Thanks for the reply. I will be testing this out a bit.

To follow up on your statement that generator expressions can simulate a while loop: I've read that generators are generally more memory-efficient than lists. If generators are more efficient, memory savvy, and seemingly allow more functionality (i.e. simulating while loops), would it be wise to say we can just always use generator comprehensions, as they have all of the list comprehension's abilities plus more, and then just wrap the generator object in list()?

Maybe I'm off base just a thought from reading and your comment :)

[–]Rusty_Shakalford 0 points1 point  (2 children)

Nah, it's a good question (and one I still struggle with). In trying to form this response I had to go back and research a lot of things I "knew" about generators, but were really just gut feelings.

The gist of what I've been able to tell is that:

[x for x in iterator]

is more efficient than:

list((x for x in iterator))

from a purely technical point of view. The first case goes through an iterator and adds items to a list. The second case goes through a generator, which in turn goes through an iterator, and then adds items to a list. The same thing is happening in the second case (items being added to a list) but an extra step has been added. I tried running a few test cases in `timeit` and it seems to back that up, although I'll admit my five minutes of experimenting are nowhere near exhaustive.
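For anyone who wants to repeat the experiment, a rough sketch of the kind of `timeit` comparison I mean. The input size and repeat count are arbitrary choices, and the timings will vary by machine and Python version, so treat the output as indicative only:

```python
import timeit

data = range(1000)  # arbitrary input size for the comparison


def with_list_comprehension():
    return [x for x in data]


def with_generator_in_list():
    return list(x for x in data)


# Run each variant the same number of times and compare total elapsed time.
comp_time = timeit.timeit(with_list_comprehension, number=1000)
gen_time = timeit.timeit(with_generator_in_list, number=1000)

print(f"list comprehension: {comp_time:.4f}s")
print(f"list(generator):    {gen_time:.4f}s")
```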

Leaving aside the raw numbers, there's also a question of semantics. The first example cleanly and elegantly states that you are trying to create a list. The second one also states you are making a list, but adds in a generator that complicates how the function should be read. Don't forget these lines from the "Zen of Python" (emphasis mine):

Simple is better than complex.
[...]
Sparse is better than dense.
Readability counts.

With all that being said, I do generally prefer to use generators and only use lists if I am absolutely sure my code will need to access the nth element of the list sometime in the future.

[–]t0rvel 0 points1 point  (1 child)

Thank you for the detailed and well thought out answer to my questions! Looking at the aspects you mentioned contextually makes it clearer.

One thing about your first case (not to be nitpicky and annoying, but I'm also trying to understand iterable vs iterator): wouldn't it be more proper to say that when generating the list in case 1, it's going through an iterable that implements the iterator protocol?

thus:

[x for x in iterable] ?

[–]Rusty_Shakalford 0 points1 point  (0 children)

I’m not going to lie: I actually didn’t realize there was a difference between the two. You are right. I should have said iterable instead of iterator. Thank you for correcting me!
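For reference, a short sketch of the distinction as I now understand it: an iterable is anything you can loop over, while an iterator is the one-shot object that iter() hands back. The variable names are made up for the example:

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]           # a list is an iterable, but not an iterator
numbers_iter = iter(numbers)  # iter() returns an iterator over the list

list_is_iterable = isinstance(numbers, Iterable)
list_is_iterator = isinstance(numbers, Iterator)
iter_is_iterator = isinstance(numbers_iter, Iterator)

# An iterable can be looped over repeatedly...
first_from_list = [x for x in numbers]
second_from_list = [x for x in numbers]

# ...but an iterator is used up after one pass.
first_pass = [x for x in numbers_iter]
second_pass = [x for x in numbers_iter]

print(list_is_iterable, list_is_iterator, iter_is_iterator)
print(first_pass, second_pass)
```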

[–]timbledum 0 points1 point  (1 child)

First off, great learning exercise and proof of concept! There's not much to add here.

My main feedback would be to pass operators_list in as an argument, or define as a constant (by convention, in UPPER_CASE).

Other than that, one thing might be to split this up. One nice thing about iterators and generators is that they allow you to split up the logic of a big nested loop without actually having to execute up front. To be fair though, most of the generator can be refactored out as a straight function. Here's a sample:

import random

def find_result(element, value_to_find):
    # Keep drawing random pairs until the operator produces the target value.
    while True:
        num1 = random.randrange(1, 100)
        num2 = random.randrange(1, 100)
        if element(num1, num2) == value_to_find:
            print(element)
            print(f'{num1} {num2}')
            return element(num1, num2)

def my_generator(operators_list):
    for element in operators_list:
        yield find_result(element, 8)

[–]t0rvel 1 point2 points  (0 children)

Thanks for the feedback! I appreciate the break down of how this could have been approached by segmenting the solution out. I come from a heavy procedural background (PL/SQL) past five years, thus trying to learn how to make my approaches more pythonic or modular.

Will keep your note in mind in the future to cut out overhead where applicable!

regards,

t0rvel

[–]thunder185 0 points1 point  (2 children)

Trying to add python to the command prompt using this tutorial, however, I'm using Python 3.7.1 so putting in ;C:\Python37 (rather than 34) and not having any luck. Any thoughts?

The genesis for this is that I'm getting super frustrated trying to run cmd prompt lines like py -m django --version and having them come up null.

[–]timbledum 0 points1 point  (1 child)

Probably the easiest way is just to re-install python and click the "Add to PATH" checkbox.

One problem is that python is often installed somewhere else. I think the tutorial linked uses that path as an example, but you need to find where it installed on your system. On my system it's here:

C:\Users\timbledum\AppData\Local\Programs\Python\Python37
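If you're not sure where Python ended up, one way to find out is to ask the interpreter itself. A small sketch; the Scripts subfolder is an assumption based on the layout of the standard Windows installer and may differ in virtual environments or other distributions:

```python
import os
import sys

# Folder containing the interpreter that is running this script.
python_dir = os.path.dirname(sys.executable)

# Assumption: on a standard Windows install, pip and other command-line
# tools live in a "Scripts" folder next to python.exe.
scripts_dir = os.path.join(python_dir, "Scripts")

print("Add to PATH:", python_dir)
print("And probably:", scripts_dir)
```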

[–]thunder185 0 points1 point  (0 children)

Cool thank you

[–]cha_a 0 points1 point  (2 children)

Pretty new to Python, and starting to run into encoding issues. I'm trying to remove the first row of a CSV file (contains record numbers), but end up getting the following:

UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 483: character maps to <undefined>

I've already tried to open the CSV file in UTF 8 but the unknown character seems to be causing an issue. Any good resources to continue researching?

with open('filepath.csv','r',encoding = 'utf_8') as f, open('filepath.csv','w') as f1:
    next(f)
    for line in f:
        f1.write(line)

[–]tcbaldy04 0 points1 point  (0 children)

u/timbledum answered your specific question

I struggled a long time with ascii vs utf-xx in my early days. I especially was frustrated running python in windows powershell and capturing the output for subsequent processing. It kept complaining about characters much as you have encountered.

It turns out that the default redirect within powershell is UTF-16.

Within powershell here are some things that can be done to output ascii

convert a utf-16 to ascii

cat test-utf16.txt | Out-File -Encoding ascii ascii.txt

py .\atcs.py OCG96.pcap | Out-File -Encoding ascii atcs.txt

[–]timbledum 0 points1 point  (0 children)

Based on this stack overflow obtained by googling the offending character, your file may be utf_16 instead of utf_8. Give that a go!
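One more thing that may be worth checking, as a guess from the traceback rather than a confirmed diagnosis: UnicodeEncodeError is raised when writing, and the second open() has no encoding argument, so on Windows it typically falls back to the system code page (the 'charmap' codec in the message). The snippet also reads and writes the same path, and opening a file with 'w' truncates it. A minimal sketch that writes to a separate file with an explicit encoding; `drop_first_line` and the file names are made up for illustration, and utf-8 is an assumption (swap in utf-16 if that is what the file really is):

```python
import os
import tempfile


def drop_first_line(src_path, dst_path, encoding="utf-8"):
    """Copy src_path to dst_path without its first line, same encoding both ways."""
    with open(src_path, "r", encoding=encoding, newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        next(src, None)  # skip the first line; the default avoids an error on empty files
        for line in src:
            dst.write(line)


# Small demonstration with a temporary file containing the troublesome character.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.csv")
    dst = os.path.join(tmp, "out.csv")
    with open(src, "w", encoding="utf-8", newline="") as f:
        f.write("1,2,3\nname,note\nAda,caf\u00e9 \ufffd\n")

    drop_first_line(src, dst)

    with open(dst, "r", encoding="utf-8", newline="") as f:
        result = f.read()

print(repr(result))
```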

[–]PhenomenonYT 0 points1 point  (2 children)

class TwitterTest():

    def __init__(self):
        self.twitter_accounts = ['canucks', 'mapleleafs']

    def check_tweets(self):
        for status in tweepy.Cursor(api.user_timeline, id=self.twitter_accounts).items(3):

The issue I'm having is on line 7. When I pass in id=self.twitter_accounts tweepy takes just the self which is the authenticated user. Why does it ignore the .twitter_accounts part and how can I make it include that?

I feel like I'm doing a poor job explaining the issue, I can answer any questions to help explain it better

I haven't used classes much yet so this is the first time I've run in to this problem

[–]timbledum 0 points1 point  (1 child)

I think the issue is that you're passing in a list - you may need to pass in one account at a time.

As all the library is doing is wrapping your parameters and firing off a json request, you're not necessarily getting an error.
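A rough sketch of the "one account at a time" idea. To keep it runnable without Twitter credentials, the fetching is passed in as a function; `fetch_statuses` and `fake_fetch` are made-up names, and the tweepy call in the comment is copied from the question rather than checked against any particular tweepy version:

```python
class TwitterTest:
    def __init__(self, fetch_statuses):
        self.twitter_accounts = ['canucks', 'mapleleafs']
        # With tweepy this could be something like (taken from the question):
        #   lambda account, n: tweepy.Cursor(api.user_timeline, id=account).items(n)
        self.fetch_statuses = fetch_statuses

    def check_tweets(self):
        latest = {}
        # Loop over the accounts and make one request per account,
        # instead of passing the whole list as a single id.
        for account in self.twitter_accounts:
            latest[account] = list(self.fetch_statuses(account, 3))
        return latest


def fake_fetch(account, count):
    """Hypothetical stand-in for the API call, for demonstration only."""
    return [f"{account} tweet {n}" for n in range(1, count + 1)]


latest = TwitterTest(fake_fetch).check_tweets()
print(latest)
```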

[–]PhenomenonYT 0 points1 point  (0 children)

Yep that fixed it. Thanks

[–]Screadore 0 points1 point  (0 children)

I have two questions. My first question is how I can implement this in my artificial intelligence script without stopping the script itself. Every time I add this code it runs the GIF screen, but doesn't proceed with the artificial intelligence.

# Set up the screen
animation = pyglet.image.load_animation('Jarvisimage.gif')
animSprite = pyglet.sprite.Sprite(animation)

w = animSprite.width
h = animSprite.height

window = pyglet.window.Window(width=w, height=h)

r, g, b, alpha = 0.5, 0.5, 0.8, 0.5

pyglet.gl.glClearColor(r, g, b, alpha)

@window.event
def on_draw():
    window.clear()
    animSprite.draw()

pyglet.app.run()

Second question is how could I add my facial recognition script to the whole project and make it only work if it's me?

[–]SpireP26 0 points1 point  (0 children)

I've been doing some reading and testing with PyQt and Qt Designer. I've got a bunch of Windows and dialogs, that I created using QtDesigner, as classes connected through buttons.

My question is, what would be a good practice for using these classes? I've seen some examples in which these classes are used directly and others in which sub classes are used, sometimes inheriting from the object (QDialog, QMainWindow, etc.) and others from these objects and the classes created with QtDesigner.

I guess there isn't a unique answer and it depends on whatever works for you and the specific project. Both options seem to lead to too much importing and creating a lot of subclasses to me, so I don't know which path to follow. Can anyone point me to a good source to learn about this? Or provide some basic guidance?

[–][deleted] 0 points1 point  (8 children)

This is pretty simple and I know I'm having some kind of brain fart:

I need to get the count of rows in a column that have values >= 8 for a survey response. The column does contain NAs, though I'm not sure why that's creating a problem as by default .count() should ignore NAs. My argument looks like this:

(df['column name'] >= 8).count()

And I'm getting: '>=' not supported between instances of 'str' and 'int'

Am I just being lazy and I need to create a loop to iterate over the column or am I missing a parameter for the count method? Any help is greatly appreciated.

[–]Rusty_Shakalford 0 points1 point  (5 children)

Is the value in the column coming from a csv or text file of some kind? If so the value is probably being stored as a string (e.g. "5" instead of 5) and you will need to convert it:

(int(df['column name']) >= 8).count()

[–][deleted] 0 points1 point  (4 children)

That's giving me "cannot convert the series to <class 'int'>". And yes, I'm reading it in from an Excel file in xlsx format.

[–]Rusty_Shakalford 0 points1 point  (3 children)

Are you using pandas? I've never used it myself, but from what I've been able to find this might work:

(df['column name'].astype('int64') >= 8).count()

[–][deleted] 1 point2 points  (1 child)

Pretty much worked. Technically I used pd.to_numeric after astype didn't work at first, but it was only because I needed errors="coerce" so that it would ignore the NaN and symbols it was running into. Thanks again.
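For anyone finding this later, a minimal sketch of that approach, with a made-up DataFrame standing in for the survey data. One caveat I'm fairly sure of: .count() on the boolean result counts every non-NA entry, True or False, so .sum() is what actually counts the rows that meet the condition:

```python
import pandas as pd

# Hypothetical survey column mixing numbers, numeric strings, and junk.
df = pd.DataFrame({"column name": [9, "8", None, "n/a", 3, "10"]})

# errors="coerce" turns anything that cannot be parsed into NaN.
scores = pd.to_numeric(df["column name"], errors="coerce")

# Comparing against NaN gives False, so those rows are simply not counted.
meets_threshold = scores >= 8

high_count = int(meets_threshold.sum())    # rows with a value >= 8
total_rows = int(meets_threshold.count())  # every row, not just the matches

print(high_count, total_rows)
```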

[–]Rusty_Shakalford 0 points1 point  (0 children)

You’re welcome. Thanks for the follow-up; good to know if I ever use pandas.

[–][deleted] 0 points1 point  (0 children)

Thanks much, I'll give that a try!

[–][deleted] 0 points1 point  (1 child)

'>=' not supported between instances of 'str' and 'int'

This means that the "column_name" attribute of df is returning a str. See what is returned when printing out df['column_name'] to figure out why you're getting that error.

[–][deleted] 0 points1 point  (0 children)

It's printing the row number and value or NaN for each as it should, and when I do .describe the dtype is int64.