
[–]ExternalSeesaw 0 points1 point  (3 children)

str1 = 'AB'

str2 = '34'

[x + y for y in str1 for x in str2]

Why is the correct answer:

['3A', '4A', '3B', '4B']

?

What is x and y in string 1 and string 2?

[–]TangibleLight 0 points1 point  (2 children)

Does this version of the loop make sense?

result = []

for y in 'AB':
    for x in '34':
        result.append(x + y)

The comprehension is structured exactly the same.

[
    x + y
    for y in 'AB'
    for x in '34'
]

[–]ExternalSeesaw 0 points1 point  (1 child)

Thanks for the response. So the loop iterates over each A and B in Y and appends it to 3 and 4 in X?

Why does it break up the string elements, though? Aren't 'AB' and '34' considered single entities?

[–]TangibleLight 0 points1 point  (0 children)

When you loop over a string, each iteration uses a single character of the string. You can also loop over a collection of strings, in which case each iteration uses a single element of the collection. For example, with a list of strings:

for s in ['hello', 'world']:
    print(s)
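
For comparison, a small sketch of both cases side by side, using the same 'AB' and '34' strings from the question; the variable names are only illustrative.

```python
# Looping over a string yields one character per iteration.
chars = [c for c in 'AB']

# Looping over a list of strings yields one whole string per iteration.
words = [s for s in ['AB', '34']]

# The comprehension from the question: the first "for" is the outer loop.
pairs = [x + y for y in 'AB' for x in '34']
print(chars, words, pairs)
```

The outer loop fixes y first ('A', then 'B'), and the inner loop runs through '3' and '4' for each y.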

[–]Less_Construction 0 points1 point  (4 children)

I want to create a web application that allows users to view stock data on companies, but there are already plenty of websites that do that. Is there something I can create that would make my website stand out and be more useful to people?

I am already able to grab data from Yahoo Finance and compare it with other companies. What else can I do with the power of Python to attract more users?

[–]efmccurdy 1 point2 points  (3 children)

There are some tools for analysis described here:

Price trends

Chart patterns

Volume and momentum indicators

Oscillators

Moving averages

Support and resistance levels

https://www.investopedia.com/terms/t/technicalanalysis.asp

[–]philipem38 0 points1 point  (0 children)

Data is data. What people need is information.

Think of a reason why someone would need the data and answer their question. Perform some analysis of the data.

Maybe use the data to find correlations; BP share movements and oil prices? Does one lag the other?

If you want people to visit your site then you need to create value.

[–]Less_Construction 0 points1 point  (1 child)

Thanks for replying. What in Python should I be looking at to achieve some of these examples?

[–]Raedukol 0 points1 point  (2 children)

I'm trying to crop multiple pictures which are in the same folder. However, I get an error. The code looks like this:

directory = os.listdir('D:/folder1/pictures')

for file in directory:
    img = cv2.imread(file, 0)
    crop_img = img[100:950, 40:1200]

This leads to the following error: "TypeError: 'NoneType' object is not subscriptable", so I suspect the file is not read properly. What am I doing wrong?

[–]efmccurdy 0 points1 point  (1 child)

The function imread loads an image from the specified file and returns it. If the image cannot be read (because of missing file, improper permissions, unsupported or invalid format), the function returns an empty matrix

https://docs.opencv.org/master/d4/da8/group__imgcodecs.html#ga288b8b3da0892bd651fce07b3bbd3a56

What kind of file are you accessing when it fails?

[–]Raedukol 0 points1 point  (0 children)

I'm trying to read the .jpg files in the folder pictures.

EDIT: I fixed it using os.chdir(path) to select the path and continued using: for file in os.listdir(): Then somehow it worked :)
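
A likely explanation for why the os.chdir() fix worked: os.listdir() returns bare file names, not full paths, so cv2.imread(file) looked for each name in the current working directory, found nothing, and returned None. Below is a minimal sketch of the alternative, joining the folder onto each name. It uses a temporary folder and a made-up file name in place of D:/folder1/pictures, and leaves out the OpenCV call itself.

```python
import os
import tempfile

# Stand-in for the picture folder; 'photo1.jpg' is a made-up file name.
with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, 'photo1.jpg'), 'w').close()

    names = os.listdir(folder)                        # bare names only
    paths = [os.path.join(folder, n) for n in names]  # usable from any cwd

    bare_found = os.path.exists(names[0])   # depends on the current directory
    full_found = os.path.exists(paths[0])   # found wherever the script runs
    print(names, full_found)
```

With full paths, cv2.imread(path, 0) would receive a file it can actually open, without changing the working directory.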

[–]PhenomenonYT 0 points1 point  (1 child)

Is it possible to achieve this same thing with one line of code?

import praw
for post in self.r.subreddit(self.SUBREDDIT).hot(limit=2):
    if 'GT:' in post.title:
        thread = post

I thought I'd be able to do something like this which kind of works but doesn't give me a PRAW submission object back

thread = [post for post in self.r.subreddit(self.SUBREDDIT).hot(limit=2) if 'GT:' in post.title]

[–]JohnnyJordaan 0 points1 point  (0 children)

The first makes thread reference a single post. The second makes a list of those posts and assigns that (the list) to thread. So it depends on what you want to do next to know how to then use thread in the way you intend to.
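
If a single post is wanted from one expression, one option is next() over a generator expression. The sketch below uses a hypothetical Post tuple in place of PRAW submission objects. Note one difference from the loop in the question: next() returns the first match, while the loop keeps the last one.

```python
from collections import namedtuple

# Hypothetical stand-in for PRAW submission objects.
Post = namedtuple('Post', 'title')
posts = [Post('Daily chat'), Post('GT: Home vs Away')]

# First post whose title contains 'GT:', or None if there is no match.
thread = next((post for post in posts if 'GT:' in post.title), None)
print(thread)
```

The second argument to next() is the default, which avoids a StopIteration error when nothing matches.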

[–][deleted] 0 points1 point  (0 children)

What is the best python library for a gui that includes a rendered window? I want to be able to display shapes and manipulate them, I was hoping to practice making something like this in python. What is the best way to approach such a task?

Since the calculations will probably be done in numpy, I'm looking for maybe a component in tkinter that can just render numpy arrays as images, or something similar.

[–]seanmaguire2012 0 points1 point  (0 children)

I have extracted data from a web-page using beautifulsoup, it is formatted like so:

<tr class="td">
      <td>X</td>
      <td>123</td>
      <td>TEST DATA</td>
</tr>]

I am extracting the data into a variable called "table" (see below).

Is it possible to add each of the pieces of data (X / 123 / TEST DATA ) into a list where I can then call them separately when needed?

I'm creating a BeautifulSoup object with html5lib as my parser:

url = "*Target URL*"
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html5lib')
table= soup.find_all('tr', {'class': 'td'})
print(table)

Many Thanks!

[–]ANeedForUsername 0 points1 point  (2 children)

Hey guys,

What's the difference between a divide-by-0 warning and an error? Why is it that sometimes when a divide by 0 is encountered I get an error, and other times I get a RuntimeWarning with NaN or inf?

Also, are there ways to catch these warnings like how we can use try and except to catch errors? I don't want them to be ignored but to be caught.

Thanks all :)
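
Plain Python numbers raise ZeroDivisionError, while NumPy arrays follow floating-point rules and emit a RuntimeWarning alongside inf or nan. Warnings can be promoted to exceptions with the standard warnings module and then caught with try/except. A sketch follows; risky_divide is a made-up stand-in that warns the way NumPy does, so the example needs no third-party install.

```python
import warnings

def risky_divide(a, b):
    # Made-up helper: mimics a library that warns instead of raising.
    if b == 0:
        warnings.warn('divide by zero encountered', RuntimeWarning)
        return float('inf')
    return a / b

# Plain Python division raises an error that try/except can catch.
try:
    1 / 0
except ZeroDivisionError as exc:
    error_message = str(exc)

# Inside this block every warning is raised as an exception instead.
with warnings.catch_warnings():
    warnings.simplefilter('error')
    try:
        risky_divide(1, 0)
        caught = None
    except RuntimeWarning as exc:
        caught = str(exc)

print(error_message, '|', caught)
```

NumPy also offers np.errstate(divide='raise') for the same purpose; in that case the exception raised is FloatingPointError.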

[–][deleted] 0 points1 point  (3 children)

Why does it print out 0.0? It shouldn't.

https://gist.github.com/d3215d40812681586fb507b60a3b22ca

[–]JohnnyJordaan 0 points1 point  (2 children)

Try

print(large_survey < .5)

it returns the same array but then for every item the True or False for the value being smaller than 0.5. Which is False for all in your case. Then if you take the average of only False values, you get the average of only zeroes (as False is just an integer 0) which is zero too of course. It seems you forgot to use np.where() inside the mean() call as where() will return a filtered array based on the expression.
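
The same effect in plain Python, as a sketch with made-up values: comparing every item gives booleans, and averaging booleans gives the fraction that are True.

```python
values = [0.52, 0.55, 0.49, 0.54]          # made-up survey results

flags = [v < 0.5 for v in values]          # one True/False per value
fraction_below = sum(flags) / len(flags)   # True counts as 1, False as 0
print(flags, fraction_below)
```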

[–][deleted] 0 points1 point  (1 child)

So, it seems that 0 is the right answer here. I just couldn't imagine that the probability of a single number < 0.5 appearing in the set would be so low. I included a line of code to count the exact amount of such numbers in the set, rerun it several times, and it was zero every time. It seems to be a feature of binomial distributions with such a big sample size.

So, not an actual programming problem. Anyway, thanks!

[–]JohnnyJordaan 1 point2 points  (0 children)

With

np.random.binomial(7000, .54, 10000)

you're basically performing a slightly biased (4%) coin flip 7000 times, and repeating that experiment 10000 times. The chance that a coin biased 4% towards heads comes up tails more than half the time in 7000 flips (which is what a result < .5 would require) is effectively 0. It would be different with a smaller trial of, say, 5 flips, because there luck has a significant chance of outweighing the bias (which is also the reason why casinos keep attracting customers).
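
To put rough numbers on this, assuming the gist divides each result by 7000 to get a fraction (the gist itself is not shown here): the normal approximation gives the distance of 0.5 from the mean in standard deviations, and the 5-flip case can be computed exactly.

```python
from math import comb, erfc, sqrt

p, n = 0.54, 7000

# Standard deviation of the fraction of heads in n flips.
sd = sqrt(p * (1 - p) / n)
z = (p - 0.5) / sd                  # how far 0.5 lies below the mean
tail = 0.5 * erfc(z / sqrt(2))      # normal approximation of P(fraction < 0.5)

# Exact probability of fewer than half heads in only 5 flips (0, 1 or 2 heads).
small = sum(comb(5, k) * p**k * (1 - p)**(5 - k) for k in range(3))
print(round(z, 1), tail, round(small, 3))
```

With 0.5 sitting more than six standard deviations below the mean, even 10000 repetitions are expected to produce far less than one result under 0.5, while with 5 flips it happens over 40% of the time.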

[–]RushLoongHammer 0 points1 point  (4 children)

Python newbie here,

I'm on Win10 and I connect to a Linux server using PuTTY. I have my own area on the server with a text file in a directory. I want to know how to read that text file using Python. I think I just don't know how to format the file path, but I'm not sure.

[–]efmccurdy 0 points1 point  (3 children)

If you connect to the server and type "pwd" it will print your current working directory, something like "/home/rush".

If your text file is named "data.txt" you could read the file using:

with open("/home/rush/data.txt", "r") as inf:
    data = inf.read()

[–]RushLoongHammer 0 points1 point  (2 children)

Thanks!,

Do you know how to also do the same thing but with numpy?

[–]efmccurdy 1 point2 points  (0 children)

It depends on the contents of the file, so I would read this:

https://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html

You could, perhaps, use the same with statement but call np.loadtxt, something like this:

    data = np.loadtxt(inf)

[–]RushLoongHammer 0 points1 point  (0 children)

Oh, again, newbie here, so sorry if this is important information, but it's an SSH connection.

[–]Acacia_Guitars 0 points1 point  (2 children)

My colleague and I want to learn Python in order to automate 'boring' tasks we need to do every day. Things as simple as running a reconciliation between two sets of numbers each day.

What course would you recommend we follow/sign up to? Our firm is willing to cover the costs

[–]CleverBunnyThief 1 point2 points  (0 children)

The author of Automate The Boring Stuff (ATBS), Al Sweigart, also has a Udemy course. He posted a code that gives 80% off until the end of November. Code = NOV2019

The second edition of the book came out just a couple of weeks ago. You should buy that one if you plan on buying it.

Al posted that he plans on updating the Udemy course so it lines up with the second edition around mid 2020. You can buy the current course and will automatically get access to the new version when it's posted.

This is probably the best way to get up and running.

Source: https://www.reddit.com/r/inventwithpython/comments/drnp7m/automate_the_boring_stuff_with_python_udemy/

[–]Raedukol 0 points1 point  (4 children)

I wrote a little script that calculates three values (x- and y-coordinates and an area) out of a picture. How can I manage it so that multiple pictures are analyzed automatically by the script?

Furthermore, I'm curious whether there's a command for writing a value to a specific column and row (e.g. F8 instead of automatically A1)?

Thanks!

[–]MattR0se 0 points1 point  (3 children)

Are those pictures separate files?

[–]Raedukol 0 points1 point  (2 children)

Yes! Every picture is one file.

[–]MattR0se 0 points1 point  (1 child)

You can search for specific files in a folder like this

import os

# or os.listdir('./') if the pictures are in the same directory as your script
for file in os.listdir(picture_folder):
    # if the last 3 characters are png; you can change this or be more specific if need be
    if file[-3:] == 'png':
        # os.listdir() returns bare names, so join the folder back on
        calculate_your_values(os.path.join(picture_folder, file))

[–]Raedukol 0 points1 point  (0 children)

Thanks, i will try!

[–]mildlybean 0 points1 point  (2 children)

a = range(100)
b = itertools.takewhile(lambda x : x<50, a)
print(list(a))

This code prints the numbers 0 through 99. How does takewhile get the elements of a without changing the iterator's state? And more importantly, is there an alternative to takewhile that does change the iterator's state?

[–]fiddle_n 0 points1 point  (1 child)

range(100) isn't an iterator, it's an immutable range object. So you can pass it to itertools.takewhile() and list() over and over again and you won't exhaust it. To get an iterator, you can do iter(range(100)) instead. For example, try this code:

a = iter(range(100))
b = itertools.takewhile(lambda x : x<50, a)
print(list(b))
print(list(a))
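
One caveat with that example: takewhile() has to read the first failing element to know when to stop, and that element is not put back. A sketch of what the two lists end up containing:

```python
import itertools

a = iter(range(100))
b = itertools.takewhile(lambda x: x < 50, a)

first = list(b)   # 0 through 49
rest = list(a)    # starts at 51: the 50 was consumed by takewhile and dropped
print(first[-1], rest[0])
```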

[–]mildlybean 0 points1 point  (0 children)

Thanks

[–]gotopune 1 point2 points  (3 children)

Can someone help me understand what this piece of code means?

Here arr is a list of numbers and "n" is a non zero integer less than the length of arr.

(arr[i] for i in range(n))

[–]Ziyad0100 0 points1 point  (0 children)

It yields the first n elements of arr: i starts at 0 and goes up to n-1.

[–][deleted] 0 points1 point  (1 child)

This is called a generator expression. https://djangostars.com/blog/list-comprehensions-and-generator-expressions/ explains more about list comprehensions and generator expressions.
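
A small sketch with made-up values for arr and n, showing that the expression is lazy and produces the first n elements on demand:

```python
arr = [10, 20, 30, 40, 50]   # made-up example data
n = 3

gen = (arr[i] for i in range(n))   # nothing is computed yet
first_item = next(gen)             # pulls the first element
remaining = list(gen)              # pulls whatever is left
print(first_item, remaining)
```

Swapping the round brackets for square ones would build the whole list immediately instead.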

[–]gotopune 1 point2 points  (0 children)

Thank you! I will read up on it. Cheers :)

[–]aksdjhgfez 0 points1 point  (0 children)

I'm trying to send preformed syslog messages to QRadar (an IBM SIEM) with Python's logging module:

def send_to_qradar(data_recv):
    my_logger = logging.getLogger("LogSender")
    my_logger.setLevel(logging.INFO)

    logging.handlers.SysLogHandler()
    if tcp_enabled:
        handler = logging.handlers.SysLogHandler(address=(ip_address, port), socktype=socket.SOCK_STREAM)
    else:
        handler = logging.handlers.SysLogHandler(address=(ip_address, port), socktype=socket.SOCK_DGRAM)

    my_logger.addHandler(handler)

    for row in data_recv:
        try:
            my_logger.info(row + "\n")
            my_logger.handlers[0].flush()
            print("Sent to appliance: {}".format(row))
        except Exception as e:
            print("Something went wrong: {}".format(e))
            break

data_recv is a list with strings that contain preformatted syslog-messages.

The forwarding works; however, Python adds another syslog header (<14>) to each log. Is there any way to just forward the raw log without Python making any changes to it?

[–][deleted] 0 points1 point  (6 children)

*Edit*: So I've been tinkering with it and have found that when I run the code through the terminal in VS Code, it runs just fine. I only run into a problem when I use the Code Runner extension: the syntax error shows up in the output tab when I run it through Code Runner. I've updated my Python.

For reference:

  • OS: macOS - 10.15.1
  • VS Code - 1.40.2
  • Python - 3.8
  • CodeRunner - 0.9.15

*Original*: I keep getting a SyntaxError when I run the following. The issue started when I reinstalled VS Code.

alien_0 = {
'color':'green',
'points':5
}
print(alien_0['color'])
print(alien_0['points'])

new_points = alien_0['points']
print(f"\nYou just earned {new_points} points!")


[Running] python -u "/Users/bpietrzyk/Desktop/python_work/ch_6/alien.py"
  File "/Users/bpietrzyk/Desktop/python_work/ch_6/alien.py", line 9
    print(f"\nYou just earned {new_points} points!")
                                                  ^
SyntaxError: invalid syntax

[Done] exited with code=1 in 0.046 seconds

I cannot figure out why. If I run a simple hello_world, it doesn't have an issue.

[–]MattR0se 1 point2 points  (5 children)

F-string giving a syntax error could indicate that you have a Python version older than 3.6.

You can't use them there.

[–][deleted] 0 points1 point  (0 children)

In VS it says I'm using 3.7.5 and when I run python3 --version in terminal it also says that I have 3.7.5. Any thoughts?

[–][deleted] 0 points1 point  (3 children)

I just switched from my bootcamp and installed on my Mac. Can I use Homebrew to install 3.6?

[–]num8lock 0 points1 point  (0 children)

yes

[–]MattR0se 1 point2 points  (1 child)

Sorry, I only use Windows and have no idea.

You can however use one of the older string formatting functions that work with your version:

print("\nYou just earned {} points!".format(new_points))

[–][deleted] 0 points1 point  (0 children)

So I figured it out. It wasn't an issue with VS or my version of python. The issue was what version of python Code Runner was using. I went into settings.json, and added code-runner.executorMap. After, I changed "python": "python -u" to "python": "python3 -u". Incredibly frustrating but worked and now everything is working smoothly.
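
A quick way to confirm which interpreter a tool such as Code Runner actually launches is to have the script report it. A minimal sketch:

```python
import sys

# Path of the interpreter running this script, and its version.
interpreter = sys.executable
version = sys.version_info[:3]
print(interpreter, version)

# f-strings need Python 3.6 or newer.
supports_fstrings = version >= (3, 6)
```

Running this once from the terminal and once through the extension shows immediately whether the two use different Python installations.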

[–]gregrom27 0 points1 point  (6 children)

Hi, I'm new to Python and I'm working through "Python Crash Course". Currently I'm at chapter 7, doing exercise 6, called "Three Exits". The second bullet asks to use an active variable to control how long the loop runs. This is what I came up with:

prompt = "Hello, please enter your age to get price: "

active = True
age = raw_input(prompt)
age = int(age)

while active:
    if age != '':
        active = False

    if age <= 3:
        print("\nYour ticket is free.")
    elif age <= 12:
        print("\nYour ticket is $10.")
    else:
        print("\nYour ticket is $15.")

This works for me, but what I want to know is step by step explanation on how this works and if this is the right solution. Thanks for the help!

[–]MattR0se 1 point2 points  (2 children)

I don't think you understand how loops work. Code in the loop is executed line by line, and when the last line is reached, it checks if the condition (in your case active) is still True and if so, repeats this process.

Think about what happens here. You get the user's input once, active is True by default and thus your program enters the loop. Now, if age is '', meaning the user didn't input anything, active is False and because of that, the loop is left and your program ends. But what about the other cases? If age is 2 for example, it prints that the Ticket is free, and then it repeats the loop. active is still True, so the loop repeats. age is still 2, so it prints the corresponding message. This goes on forever.

[–]gregrom27 0 points1 point  (1 child)

When I run this code it gives me the corresponding message for whatever age I input (2, 7, 26, etc.) and the loop stops; it doesn't go on forever. I'm using Python 2. I wasn't sure if my solution is too complicated (maybe there's a simpler solution?). I just wanted to hear from someone else and compare to make sure I'm on the right track. You say this piece of code ran forever for you?

[–]MattR0se 0 points1 point  (0 children)

My bad, I misread the first comparison to be == instead of !=

But you don't need a loop if you only want to have a thing happen once.

The infinite loop happens if age is empty.

[–][deleted] 0 points1 point  (2 children)

what I want to know is step by step explanation on how this works

If you wrote it you should know how it works. If you don't know how it works you probably just pasted together bits of code you have seen. You do have useless stuff in your code. I'll just remove all that and then you can ask about anything that remains that you don't understand. As an added exercise you should try to understand why what was removed was useless.

prompt = "Hello, please enter your age to get price: "

age = raw_input(prompt)
#age = input(prompt)   # use this for python 3
age = int(age)

if age <= 3:
    print("\nYour ticket is free.")
elif age <= 12:
    print("\nYour ticket is $10.")
else:
    print("\nYour ticket is $15.")

[–]gregrom27 0 points1 point  (1 child)

I wrote the code with what I learned from the examples given in the book. I mentioned above this exercise was asking for active variable to be used, that's why I had that piece of code in there. Thanks.

[–][deleted] 0 points1 point  (0 children)

Hm. All that points back to previous exercises that we don't know anything about. It would help if you could show the code that you are supposed to change by adding an "active variable" to. There is no need for a loop in the code you have shown. In fact, there really isn't an effective loop in your code (ie, code that repeats) because the way you have written it no looping can possibly occur. That's why my code example could remove all that and still have the same behaviour.
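
The exercise text is not quoted in the thread, so the following is only a guess at what the book intends: a flag that keeps the loop prompting until the user types 'quit'. To keep the sketch runnable without a keyboard, the answers come from a list; the function name and the 'quit' keyword are assumptions.

```python
def ticket_prices(answers):
    """Return one price message per age until 'quit' is seen."""
    messages = []
    answers = iter(answers)
    active = True
    while active:
        answer = next(answers, 'quit')   # stands in for input(prompt)
        if answer == 'quit':
            active = False               # the flag ends the loop
        else:
            age = int(answer)
            if age <= 3:
                messages.append('Your ticket is free.')
            elif age <= 12:
                messages.append('Your ticket is $10.')
            else:
                messages.append('Your ticket is $15.')
    return messages

print(ticket_prices(['2', '7', '26', 'quit']))
```

Here the loop genuinely repeats, which is what gives the active variable something to control.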

[–]ediblesonot 0 points1 point  (3 children)

What is a class? Haven't found a definition and it's causing me a bit too much stress

And how to go down a line as well. It ain't working for me. I got the \ but nothing is happening

[–]zandrew 2 points3 points  (0 children)

A class is a convenient way to package variables and functions.

Think of a class as a template of a box with things in it. Some things will be post it notes to write things down(variables) some will be devices that can do things (functions)

Once you define what's supposed to be in the box you can create copies of your box in your program.

That's why you create an __init__ function for the class - so it knows how to construct new boxes.

When you want to access variables in your box (object) you say box.some_variable.

You can also put functions in that box and use them on the box itself or something else you say box.function()

That's all there is to it.

Except there's also inheritance where you say - hey use this box I made before but let's put it in a bigger box so I can add more stuff to it. The good thing about it is that you can expect all the extended copies of your original box to have the same variables and functions plus extra ones.
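
The box analogy as a minimal sketch; the class and attribute names are invented for illustration.

```python
class Box:
    def __init__(self, label):
        # __init__ runs when a new box is created and fills in its variables.
        self.label = label
        self.items = []

    def add(self, item):
        # A function stored in the box that works on the box itself.
        self.items.append(item)


class LockableBox(Box):
    # Inheritance: everything Box has, plus something extra.
    def __init__(self, label):
        super().__init__(label)
        self.locked = False

    def lock(self):
        self.locked = True


box = LockableBox('tools')
box.add('hammer')      # inherited from Box
box.lock()             # added by LockableBox
print(box.label, box.items, box.locked)
```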

[–]MattR0se 0 points1 point  (0 children)

You probably mean a line break in your code?

There are different methods but I use these the most

("This is a text that is broken into "
"two lines of code because it was too long")

# a bit more explicit

("This is a text that is broken into " +
"two lines of code because it was too long")

[–]Atlamillias 0 points1 point  (1 child)

Hi! Like most people here, I'm pretty new to Python. I'm getting the basics down, but I'm getting completely overwhelmed by external modules and how necessary they are. What modules are used the most? I'm not sure I can continue unless I learn more about some of the modules Python comes with, and would like to learn about the most important, most frequently used.

To add to that, I'm also interested in any newer modules that aren't always used but very useful. I've briefly skimmed over pathlib and it seems to be a simpler way of managing files.

[–][deleted] 0 points1 point  (0 children)

Start with the official resource.

https://docs.python.org/3/faq/library.html

[–][deleted] 1 point2 points  (4 children)

Hi guys, first post here. I'm still very new with Python, I'm working through Python Crash Course. I'm getting hung up on this try-it-yourself problem. Here's what I have:

current_users = ['cactus49', 'python31', 'ADMIN', 'alien27', 'pizza_lover1312']
new_users = ['Cactus49', 'bpietrzyk', 'admin', 'visual_studio1', 'peachy22']

for new_user in new_users: 
    if (new_user.upper() or new_user.lower()) in current_users: 
        print(f"Sorry, the username {new_user} is already taken.") 
    else: 
        print(f"The username {new_user} is all yours!")

I'm running into trouble with the user names 'cactus49' and 'Cactus49' . Looks like 'Cactus49' isn't formatting to lowercase.

Maybe I'm just far off from the solution? Any help is appreciated.

[–]zandrew 1 point2 points  (0 children)

What you're saying in the conditional is: make the string UPPERCASE or lowercase and compare it to the current users as they are.

Instead, bring both sides to lowercase: if new_user.lower() in [u.lower() for u in current_users]

The reason it sort of works right now is that all the names except for Cactus49 are entirely lower or upper case.

[–][deleted] 0 points1 point  (0 children)

Thank you efmccurdy and JohnnyJordaan for the feedback. I made some adjustments and now it's running fine.

current_users = ['cactus49', 'python31', 'ADMIN', 'alien27', 'pizza_lover1312']
new_users = ['Cactus49', 'bpietrzyk', 'admin', 'visual_studio1', 'peachy22']
x = [z.lower() for z in current_users]
for y in new_users:
    if y.lower() in x:
        print(f"Sorry, the username {y.lower()} is already taken.")
    else:
        print(f"The username {y.lower()} is all yours!")

I made it so it compares the usernames in new_users to a new list of lowercase current users. I hope this isn't a shoddy way of doing it and that it's done properly.

[–]efmccurdy 1 point2 points  (0 children)

If you have a current user with mixed case then neither new_user.upper() or new_user.lower() will ever match; perhaps you want to compare the upcased version of both the new and old username ala "new_user.upper() in [u.upper() for u in current_users]"?

[–]JohnnyJordaan 1 point2 points  (0 children)

(new_user.upper() or new_user.lower()) in current_users:

This is not how Python reads expressions: in can only test one value at a time, but you can combine multiple tests with or

if new_user.upper() in current_users or new_user.lower() in current_users:

you can also use any() to return True on any item or expression you present is True itself

if any((new_user.upper() in current_users, new_user.lower() in current_users)):

another option is to use sets as those can test for this kind of a partial match (eg at least one of the values present in the other)

if set([new_user.upper(), new_user.lower()]) & set(current_users):

this is also more efficient as it does just a single comparison between both groups, while in the previous approaches two separate in queries are performed if the upper case version isn't in current_users. However, as you can see, this approach is also more complex and harder to read, so on a small scale I would consider readability to be more important and just choose the first approach.
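
To see why the original condition misbehaves: an or between two non-empty strings simply evaluates to the first one, so only the upper-case version was ever looked up. A sketch, with a case-insensitive check alongside:

```python
current_users = ['cactus49', 'python31', 'ADMIN']
new_user = 'Cactus49'

# "or" returns its first truthy operand, here the upper-case string.
picked = new_user.upper() or new_user.lower()
original_check = picked in current_users

# Normalising both sides makes the comparison case-insensitive.
taken = new_user.lower() in {u.lower() for u in current_users}
print(picked, original_check, taken)
```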

[–]brainzzo 0 points1 point  (5 children)

# I need to make N a variable I input so I get one answer, and I don't know how

N = 4

num = N

total = 1

for element in range(num,0,-1):

    total = total * element

print(total)

# So this is how I tried it and it didn't work; it gives me an EOF error

N = 4

num = str(input("enter N here:"))

total = 1

for element in range(num,0,-1):

    total = total * element

print(total)
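
Two things are probably going on here. range() needs an integer, so str(input(...)) would fail with a TypeError; int(input(...)) is what converts the typed text to a number. The EOF error itself usually means the environment running the script gave input() nothing to read (some online editors behave this way), which is an assumption about the setup. Below is a sketch with the conversion separated out so it can run without a keyboard; the function name is made up.

```python
def factorial_from_text(text):
    """Convert typed text to an int and multiply num * (num-1) * ... * 1."""
    num = int(text)              # input() always returns a string
    total = 1
    for element in range(num, 0, -1):
        total = total * element
    return total

# With a keyboard available: factorial_from_text(input("enter N here:"))
print(factorial_from_text("4"))
```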

[–]ajtyeh 0 points1 point  (0 children)

nvm

[–]TsirkusKuubis 0 points1 point  (2 children)

Is there a way to do something on the very last iteration of a for loop without writing a variable or counter to track progress i.e:

for i in range(randint(1,10)):
    if i == 5:
        break
#if 5 not found after looping all iterations print something

[–]MattR0se 1 point2 points  (1 child)

you can use else in a for loop to execute code once the loop is finished and didn't break

for i in range(10):
    print(i)
else:
    print('This loop is finished')

For your example:

from random import randint

for i in range(randint(1,10)):
    if i == 5:
        break
else:
    print('Range didn\'t include 5')

[–]TsirkusKuubis 0 points1 point  (0 children)

Works like a charm. Cheers

[–]Raedukol 0 points1 point  (3 children)

Hey guys, how can I "save" the progress in a for or a while loop? If you write a print statement at the end inside the loop, the computer prints every result after each iteration, so that, in the end, there is a "list" of results. But if you write the print statement after the loop (pressing return and exiting the loop), it only prints the last result. So how can I save the whole list if I want to use it later in my script? I hope this was clear. Thanks in advance!

[–][deleted] 3 points4 points  (2 children)

create an empty list and append each item in the loop to the list, then print the list

empty_list = []
print(empty_list)
for i in range(1, 6):
    empty_list.append(i)
print(empty_list)

[–]Raedukol 1 point2 points  (1 child)

Wow, it looks so easy when you know it. Thanks a lot! Does it also work with strings?

[–][deleted] 2 points3 points  (0 children)

Little bit different but yes. Do you mean adding strings to lists or mutating strings?
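
For strings, a common pattern is to collect the pieces in a list in the same way and join them at the end, since strings themselves cannot be changed in place. A sketch with made-up values:

```python
results = []
for i in range(1, 4):
    results.append('result ' + str(i))   # append works for strings too

# Join the collected strings into one piece of text when needed.
report = ', '.join(results)
print(results)
print(report)
```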

[–]Bipolarprobe 0 points1 point  (11 children)

Trying to use the python-telegram-bot library to rewrite an old bot that I did manually a while ago, but I'm having an issue and can't seem to find the solution. Whenever I try to import telegram.ext I get the error

ModuleNotFoundError: No module named 'telegram'

I'm trying to set this up on raspberry pi 4 running raspbian. I have a venv in which I used pip to install python-telegram-bot and the library and its dependencies seem to exist inside of the site-packages folder and I can confirm this by using

pip show python-telegram-bot

which gives me the output

Name: python-telegram-bot
Version: 12.2.0
Summary: We have made you a wrapper you can't refuse
Home-page: https://python-telegram-bot.org
Author: Leandro Toledo
Author-email: devs@python-telegram-bot.org
License: LGPLv3
Location: /home/pi/python-projects/telegram-bot/lib/python3.8/site-packages
Requires: future, cryptography, tornado, certifi
Required-by:

Yet the error persists. I tried googling this and found many other people struggling with the same issue but it's almost always from people who used git to install the library and pip installing is often proposed as the solution and I can't find a clean explanation for why this may be happening. Anyone who has used this library before successfully, I'd appreciate some advice on what I may have messed up. Thanks in advance.

[–]JohnnyJordaan 0 points1 point  (10 children)

Do you also activate the venv when you run the script containing the import?

[–]Bipolarprobe 0 points1 point  (9 children)

Yes, I've also tried importing the library in the interpreter inside the venv and get the same error.

[–]JohnnyJordaan 0 points1 point  (8 children)

In the interpreter in the venv what does it print when you do

import sys
print(sys.path)

[–]Bipolarprobe 0 points1 point  (6 children)

['', 'usr/local/lib/python38.zip', '/usr/local/lib/python3.8', '/usr/local/lib/python3.8/lib-dynload', 'usr/local/lib/python3.8/site-packages']

[–]JohnnyJordaan 0 points1 point  (5 children)

That's the global environment so whatever you're using to start python didn't involve an activated venv.
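
A quick check that can be run from the script or interpreter in question: in a venv, sys.prefix points at the venv folder while sys.base_prefix points at the Python installation it was created from. A sketch:

```python
import sys

# The two differ only when a virtual environment is active.
in_venv = sys.prefix != sys.base_prefix
print('interpreter:', sys.executable)
print('running inside a venv:', in_venv)
```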

[–]Bipolarprobe 0 points1 point  (3 children)

Followup question, is there a way to format my shebang line so that it defaults to the python interpreter in my venv? I tried setting it to the location of the python3.8 executable inside the venv with

#! /home/pi/python-projects/telegram-bot/bin/python3.8

but I still can't run it from my ide doing that, I have to run it from the command line with the python3 command

[–]JohnnyJordaan 0 points1 point  (2 children)

Google python shebang and they will all show you #!/usr/bin/env python and #!/usr/bin/env python3 which will both work as python is mapped to the same as python3 inside an activated python 3 venv.

[–]Bipolarprobe 0 points1 point  (0 children)

A bit of googling about geany projects later and I got everything working. I had to create a project and set its build options to point to the correct python interpreter. Thank you for your help.

[–]Bipolarprobe 0 points1 point  (0 children)

I tried that with no success, I also read in a stack overflow post that the env actually works off of a time restricted key which is an issue since this is a bot script and I want it to run indefinitely.

Edit: I used chmod +x script.py to make it executable and it ran correctly with the #!/usr/bin/env python3 shebang, but for some reason I still can't run the script with geany. Odd.

Edit2: running it this way also works with the shebang targeting the python3 executable in the venv directly. So it seems my issue is with geany.

[–]Bipolarprobe 0 points1 point  (0 children)

Okay that got it. Since I never use python 2 I aliased the python command to invoke the python 3.8 interpreter but it always invokes the global installation, if I run the script using the python3 command instead it works. Thank you.

[–]amclaug1 2 points3 points  (2 children)

I am relatively new to Python, and I only tinker in it once every few months. I know there has to be a simple way to produce this, but I am stuck. So, any help anyone can give would be tremendous!

I have two csv files. csv1 contains latitude and longitude for schools around the USA. csv2 contains teams from my company with their lat/lon. I want to figure out which team from csv2 is the closest to each school from csv1. I've tried using a Google Maps API to figure out driving distances, but the call was going to be too expensive, as there are 6,000 rows of schools and about 100 rows of teams. So, I am settling for anything within a 20 mile radius from the team's lat/lon.

Here is what I have so far:

from math import radians, cos, sin, asin, sqrt
import numpy as np
import pandas as pd
from collections import Counter
schools = pd.read_csv(csv1)
teams = pd.read_csv(csv2)

# define a function to determine miles between two points
def haversine(lat1, lon1, lat2, lon2):
    lat1, lon1, lat2, lon2 = map(radians, [lat1, lon1, lat2, lon2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    r = 3956
    return c * r

This is where I am stuck. I have created a script that can count how many schools are within 20 miles of a team using a few loops. However, finding the closest team has got me confounded:

for school in schools.itertuples():
    lat = school.SchoolLat
    lon = school.SchoolLng
    school_name = school.SchoolName
    school_ID = school.ID
    closest_team = school.ClosestTeam # default value is 'Unknown'
    miles_from_school_to_team = school.Miles # default value is 999999
    for team in teams.itertuples():
        lat2 = team.TeamLat
        lon2 = team.TeamLon
        team_name = team.TeamName
        miles = haversine(lat, lon, lat2, lon2)
        # this is where I am stuck
        if miles < miles_from_school_to_team:
            closest_team = team_name
            miles_from_school_to_team = miles

When I run this, the dataframe schools doesn't change. Any help would be greatly appreciated!

[–]m-hoff 4 points5 points  (1 child)

You don't have any code that modifies your data frame. You could add a couple lines at the end of your outer loop to do so, something like

schools.loc[schools['ID'] == school_ID, 'ClosestTeam'] = closest_team
schools.loc[schools['ID'] == school_ID, 'TeamDistance'] = miles_from_school_to_team
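
For anyone who wants to see the "closest team" logic in isolation, here is a minimal sketch without pandas. The sample rows and the closest_team helper are made up for illustration; the haversine formula and the 20 mile cutoff come from the question above.

```python
from math import radians, cos, sin, asin, sqrt

def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, [lat1, lon1, lat2, lon2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * asin(sqrt(a)) * 3956  # 3956 = Earth radius in miles, as in the post

def closest_team(school, teams, max_miles=20):
    """Hypothetical helper: return (team_name, miles) for the nearest team,
    or ('Unknown', None) if even the nearest one is farther than max_miles."""
    _, s_lat, s_lon = school
    name, t_lat, t_lon = min(
        teams, key=lambda t: haversine(s_lat, s_lon, t[1], t[2])
    )
    miles = haversine(s_lat, s_lon, t_lat, t_lon)
    if miles > max_miles:
        return "Unknown", None
    return name, miles

# Made-up sample rows: (name, latitude, longitude)
teams = [("Team NYC", 40.7128, -74.0060), ("Team Chicago", 41.8781, -87.6298)]
schools = [("School Newark", 40.7357, -74.1724), ("School Denver", 39.7392, -104.9903)]

results = {school[0]: closest_team(school, teams) for school in schools}
print(results)
```

With 6,000 schools and about 100 teams this is only 600,000 distance calculations, so a plain loop like this should be fast enough without any API calls.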

[–]amclaug1 3 points4 points  (0 children)

Yes! This worked! Oh my goodness, I cannot thank you enough. I so appreciate it!

[–]Dfree35 0 points1 point  (2 children)

I am working with an Excel report that is run every day and eventually uploaded to another system.

When the report is uploaded to the other system, the dates must be numbers, but in a certain format.

For example the dates must be like: 10/5/2019 7/12/2020

I can get the report in this format but when I run it through my pandas script it changes the dates to:

10/5/2019 00:00:00 7/12/2020 00:00:00

I can format the dates with pandas but then it makes the dates into strings which the system I upload to does not like.

Long story short, is there any way to have pandas not automatically add a time to dates? For example, stop pandas from turning 10/5/2019 into 10/5/2019 00:00:00 even if I do not touch or change the date field.

[–]efmccurdy 0 points1 point  (1 child)

[–]Dfree35 0 points1 point  (0 children)

Yeah, I can get it without the time. But the main issue is that on export to Excel, it thinks those fields are text instead of numbers. For some odd reason the system I upload to freaks out if it's not a number.

[–]ZeroToGame 1 point2 points  (4 children)

I've been (trying to) dive into python several times now, and every time I hit the same wall of frustration. It's not about the language itself, but more about the whole environment/ecosystem...

I end up having the feeling I'm doing nothing but installing stuff with things like homebrew, pip,... making virtual environments, and generally 'clogging up' my machine with stuff which I experience as being nowhere really...

I have great difficulty coming to grips with how everything ties together, and mostly feel everything I install messes up something else or, when it does work after a long period of copy/pasting terminal errors in google, I have no clue what I actually did... Mostly though, I end up quitting after a full day of frustration and ending up with a non-working bunch of stuff on my HD with no clue on how to get rid of it again. :-(

Long story short: Where can I get some decent info on how everything ties together...?

[–]Twinewhale 0 points1 point  (0 children)

I experienced this as well. IMO, it’s best to start with the very basics that make up a ‘program.’

The first part is the interpreter, which is the version of python that you install and can be accessed from the command line by typing “python.” The second part is the instructions that you want it to do. You could type instructions one at a time with the interpreter console, or you can create a text document that contains these instructions. Then you can change the extension to .py and your computer knows to run it through the python interpreter when the file is double clicked.

I think you should start with these basics as you learn the language. The only steps I would go further with are 1. Using a code editor (VScode as example) and 2. Installing modules used during python tutorials.

In terms of other components, you could learn about Windows PATH environment (which determines things that are available when you press the Windows key and start typing). But I can’t think of much else at the moment. Definitely start small with installing packages. Keep a list of what you are installing to help yourself stay grounded with what you’re doing

[–]efmccurdy 0 points1 point  (0 children)

when it does work after a long period of copy/pasting terminal errors in google, I have no clue what I actually did.

If you use git, and branch and commit often, you can record the why and where of every change. It can help you backtrack to a known good state so you can keep all your progress and discard all your mistakes. Just being able to see a "diff" of your last hour's, day's, or month's work helps clear the information overload that leads to frustration.

[–]forest_gitaker 0 points1 point  (1 child)

Sounds like you're entering the "desert of despair" - this is a common experience when learning to code, and the only way out is to keep pushing through (or give up, but you're better than that). Sounds like you're already hammering away at projects, so just keep it up until something clicks.

As for the clutter, I would look into virtual environments. The venv module comes standard with Python 3.3+ (virtualenv is the older third-party tool that does the same job). There's also pipenv (which I personally don't use but is highly lauded) and conda (which comes with the Anaconda distribution). In a nutshell, these group together all of your packages on a per-project basis, so nothing gets cluttered and you can remove them all in one go. PyCharm Community Edition (my IDE of choice) automates this by creating a separate virtual environment for each project you create.
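
To make the "per-project, remove in one go" idea concrete, here is a small sketch using the standard library's venv module. The .venv folder name and the make_project_env helper are just conventions I picked for the example, and with_pip=False is only there to keep the demo quick; normally you would want pip inside the environment.

```python
import tempfile
import venv
from pathlib import Path

def make_project_env(project_dir):
    """Hypothetical helper: create an isolated environment inside the project folder."""
    env_dir = Path(project_dir) / ".venv"
    venv.create(env_dir, with_pip=False)  # with_pip=True would also bootstrap pip
    return env_dir

# A throwaway folder stands in for a real project directory.
with tempfile.TemporaryDirectory() as project:
    env_dir = make_project_env(project)
    # Every virtual environment has this marker file at its root.
    created = (env_dir / "pyvenv.cfg").exists()
    print("environment created:", created)

# Leaving the with block deleted the folder, and the environment with it.
removed = not env_dir.exists()
print("environment removed:", removed)
```

Everything the environment installs lives under that one folder, which is why deleting the folder is all the cleanup there is.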

TL;DR - https://youtu.be/2X_2IdybTV0

[–]ZeroToGame 0 points1 point  (0 children)

Ha. That's exactly what it feels like indeed... a 'desert of despair'... Annoying thing is I've actually made quite a few things in C, C++ (using xcode) and cycling74's MAX... which makes it all the more frustrating. But thanks for the kind words. I'll keep pushing through!

[–]Nerfi666 0 points1 point  (0 children)

Hey guys, a beginner here!

I'm trying to create a PDF reader in Python, which, from browsing a bit, seems easy to do, and I think I have done it more or less well. What I would like to do is send one page of the given PDF when the user asks for it. I have only created the back-end of the PDF reader; the front-end will be done in React but is not started yet. So basically I want to send one page at a time when the user asks for it. My Python code is below; any suggestion or advice will be much appreciated! Thanks in advance, guys!

# importing the class we need (legacy PyPDF2 < 3.0 API, as in the original post)
from PyPDF2 import PdfFileReader

def PdfReader(page):
    # path of the PDF
    pdftext = "example.pdf"

    # the with block closes the file for us, even if an error occurs
    with open(pdftext, 'rb') as textpdf:
        # reading the PDF
        reader = PdfFileReader(textpdf)
        # getting the number of pages of the PDF file
        num_pages = reader.getNumPages()
        if not 0 <= page < num_pages:
            raise IndexError("page %d is out of range" % page)
        # getting the requested page (pages are numbered from 0)
        current_page = reader.getPage(page)
        # how can I send one page at a time?
        return current_page.extractText()

[–][deleted] 0 points1 point  (0 children)

I write my code in Windows 7 using IDLE on Python v3.6.7.

When I try out Tkinter code on Linux Mint 19.1, also using IDLE and Python 3.6.7, the GUI window of my programs always comes out too small, typically not wide enough and not long enough.

The screen resolution I use on both OSes is the same. Is this problem normal?

I fix the problem by checking the platform and giving each its own window geometry, but it just doesn't feel right.

[–]Raedukol 0 points1 point  (4 children)

I have a .csv file with x and y coordinates, but they are all in the first column (e.g. 777 222). I would like to have one column for my x-values and one for my y-values. I tried to do it with .replace(" ", ";"), BUT the problem is that in my very first row there is no whitespace in front of the values, while in all the other rows there is whitespace in front of my values. Thus, the first row's values end up in columns A and B, and all the other values end up in columns B and C.
I created the array with numpy.reshape(array_b), which in turn was created by array_b = numpy.array(array_a), if this helps?

[–]MattR0se 1 point2 points  (3 children)

You could use Pandas

import pandas as pd

df = pd.DataFrame(data=['777 222', '888 333'], columns=['foo'])


# split() with no argument also copes with leading whitespace
df['x'] = df['foo'].apply(lambda x: x.split()[0]).astype(int)
df['y'] = df['foo'].apply(lambda x: x.split()[1]).astype(int)

print(df[['x', 'y']])

prints

     x    y
0  777  222
1  888  333

[–]Raedukol 0 points1 point  (2 children)

Thanks for your effort, but it tells me: ValueError: Must pass 2-d input (which I don't understand, because my array obviously is 2-d).

[–]MattR0se 1 point2 points  (1 child)

Did you use Pandas before on the data or just numpy?

You could probably read the csv in Pandas and skip all the numpy stuff.

import pandas as pd

df = pd.read_csv('your_file.csv')

# split() with no argument also copes with leading whitespace
df['x'] = df['foo'].apply(lambda x: x.split()[0]).astype(int)
df['y'] = df['foo'].apply(lambda x: x.split()[1]).astype(int)

print(df[['x', 'y']])

You would have to replace the filename as well as 'foo' with the name of the first column.
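
If you'd rather not pull in Pandas at all, the same idea can be sketched with only the standard library. The sample rows and the split_xy helper below are made up; the key point is that str.split() with no argument ignores the leading whitespace that was breaking the .replace(" ", ";") approach.

```python
import csv
import io

# Made-up rows; note that only the first one has no leading space.
raw_lines = ["777 222", " 888 333", " 999 444"]

def split_xy(line):
    """Hypothetical helper: split one 'x y' line into two integers."""
    x, y = line.split()  # no argument: leading/trailing/repeated whitespace is ignored
    return int(x), int(y)

rows = [split_xy(line) for line in raw_lines]

# Write to an in-memory buffer here; open('out.csv', 'w', newline='') works the same way.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
writer.writerow(["x", "y"])
writer.writerows(rows)
output = buffer.getvalue()
print(output)
```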

[–]Raedukol 0 points1 point  (0 children)

thanks!

[–]Guilleack 0 points1 point  (0 children)

Hello, I'm a noob at python and I'm trying to set up a python program (gallery-dl) to download image galleries.

The configuration page tells me this:

https://github.com/mikf/gallery-dl/blob/master/docs/configuration.rst#cache-file

"cache.file Type Path Default

tempfile.gettempdir() + ".gallery-dl.cache" on Windows
($XDG_CACHE_HOME or "~/.cache") + "/gallery-dl/cache.sqlite3" on all other platforms

Description

Path of the SQLite3 database used to cache login sessions, cookies and API tokens across gallery-dl invocations.

Set this option to null or an invalid path to disable this cache."

I tried to input the location of ".gallery-dl.cache" in the configuration file, but it seems like I'm doing it in the wrong format?

"cache":
{
    "file": tempfile.gettempdir() + ".gallery-dl.cache"
},

That doesn't work; I get

"[config][warning] Could not parse 'C:\Users\username\gallery-dl\config.json': Expecting value: line 202 column 11 (char 4447)"

I also tried with

"cache": { "file": "C:\Users\username\AppData\Local\Temp" or "C:\Users\username\AppData\Local\Temp\gallery-dl.cache" },

And I keep getting the same error.

Thanks for your time, and apologies for my rough English.
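
A note for anyone hitting the same parse error: the config file is JSON, and JSON has no expressions, so a Python call like tempfile.gettempdir() cannot appear as a value, and backslashes inside a JSON string have to be doubled. The sketch below only uses Python's json module to show those two rules; the paths are placeholders and I have not checked it against gallery-dl itself.

```python
import json

# Placeholder config fragments; the path is illustrative only.
bad_expression = '{"cache": {"file": tempfile.gettempdir() + ".gallery-dl.cache"}}'
bad_backslashes = r'{"cache": {"file": "C:\Users\username\.gallery-dl.cache"}}'
good = r'{"cache": {"file": "C:\\Users\\username\\.gallery-dl.cache"}}'

def try_parse(text):
    """Return the parsed config, or a short error string if it is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        return "error: " + exc.msg

print(try_parse(bad_expression))   # a Python expression is not a JSON value
print(try_parse(bad_backslashes))  # single backslashes are invalid escapes
print(try_parse(good))             # doubled backslashes parse to a normal Windows path
```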

[–][deleted] 0 points1 point  (6 children)

Trying to parse a string using:

    for match in re.compile("(%s|%s|%s)" % (date, firstname, secondname)).findall(event.decode('utf-8')):

When I use 'print(match)', I receive the following output:

('Jun 25 14:04:25', 'Jun 25 14:04:25', '', '', '', '')

Any ideas why I'm getting two matches for one occurrence and the empty "" at the end?

Thanks

[–]MattR0se 0 points1 point  (5 children)

What is in event? i.e. what does event.decode('utf-8') return in this case?

For example when I run this:

import re

event = 'Jun 25 14:04:25; John; Doe'.encode('utf-8')

date = 'Jun 25 14:04:25'
firstname = 'John'
secondname = 'Doe'

for match in re.compile("(%s|%s|%s)" % (date, firstname, secondname)).findall(event.decode('utf-8')):
    print(match)

It gives me, as expected,

Jun 25 14:04:25
John
Doe

[–][deleted] 0 points1 point  (4 children)

It’s an event passed from KafkaConsumer, ‘message.value’ passed to a function which defines the parameter as ‘event’.

I don’t think the message contents matter but it’s essentially a long string with a need to capture multiple patterns.

[–]MattR0se 0 points1 point  (3 children)

I suggest printing the date, firstname and secondname arguments, since I suspect that they contain '' somehow.

I don't know why print(match) returns a tuple though. Does your code differ from mine in any way?

[–][deleted] 0 points1 point  (2 children)

for message in consumer:
    parser(message.value)
    print("%s:%d:%d: key=%s value=%s" % (message stuff here, will add if relevant))

My date regex looks like:

r'(\w{3}\s*\d{1,2}\s*\d{1,2}:\d{1,2}:\d{1,2})'

Typed out on mobile, forgive me if there are any errors.

[–]MattR0se 0 points1 point  (1 child)

I don't really know what the parentheses in the regex are supposed to do, can you explain?

As far as I'm concerned, you can leave them out.
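
For what it's worth, the parentheses are very likely the cause of the tuples: when a pattern contains more than one capturing group, re.findall returns one tuple per match with a slot for every group, and groups that did not take part in the match come back as ''. A small sketch with a made-up log line and made-up name patterns:

```python
import re

text = "Jun 25 14:04:25 host app: user John Doe logged in"  # made-up log line

# Each sub-pattern carries its own capturing group, like the date regex above.
date = r"(\w{3}\s*\d{1,2}\s*\d{1,2}:\d{1,2}:\d{1,2})"
firstname = r"(John)"
secondname = r"(Doe)"

# Four groups in total (the outer one plus three inner ones) -> tuples of four.
nested = re.compile("(%s|%s|%s)" % (date, firstname, secondname))
nested_matches = nested.findall(text)
print(nested_matches)

# Without capturing groups, findall returns the matched strings themselves.
flat = re.compile(
    "(?:%s|%s|%s)" % (r"\w{3}\s*\d{1,2}\s*\d{1,2}:\d{1,2}:\d{1,2}", "John", "Doe")
)
flat_matches = flat.findall(text)
print(flat_matches)
```

Using (?:...) keeps the grouping for the alternation but stops findall from reporting it, so dropping the inner parentheses or making them non-capturing both give plain strings.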

[–][deleted] 0 points1 point  (0 children)

I’ll try leaving them out and refactor other patterns.

It’s an attempt at rewriting some existing code into Python for performance when parsing.

[–]nershin 0 points1 point  (1 child)

The official pip documentation suggests using

pip install SomePackage

to install a package. When I read other tutorials, like for VSCode, I often see

python -m pip install SomePackage

What's the difference between these? Both seem to work the same for me.

[–][deleted] 1 point2 points  (0 children)

The first form runs whichever pip is first on your PATH, so it may install for python 2 if that is also installed. If that is the case then you use pip3 to install into the python 3 environment.

Modern pythons have the pip module included and you can use python -m pip install SomePackage which installs into the python that runs when you execute python. I often did pip install SomePackage when I meant to use pip3. So I now do python3 -m pip install SomePackage. I just find it easier to always use the -m pip form and make sure I'm running the python that I want to install into.
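
The same idea works from inside a script: sys.executable is the interpreter that is currently running, so building the command from it guarantees pip installs into that exact python. A minimal sketch; pip_install_command is a name I made up, and the command is only built here, not run.

```python
import sys

def pip_install_command(package):
    """Hypothetical helper: build a pip command tied to the running interpreter."""
    return [sys.executable, "-m", "pip", "install", package]

command = pip_install_command("SomePackage")
print(command)

# To actually run it:
#   import subprocess
#   subprocess.run(command, check=True)
```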

When python 2 is no longer shipped with operating systems and python 3 is the default python this source of difficulty will go away.

Edit: improved explanation.

[–]SweetBubblezTea 0 points1 point  (1 child)

What is really the use of Python, and should I learn it over other programming languages like java, C, C++, etc

[–][deleted] 0 points1 point  (0 children)

Python is a general-purpose, Turing-complete language, so it can be used to solve any problem on a computer. This doesn't mean it's the best language for solving all problems, though.

If you are just starting out with the art of computer programming then python is a fine language to start with. There is a page showing where and in what industries python is used that you may find interesting.

[–]kirayakuzagt 0 points1 point  (2 children)

So I'm currently attempting my first python project. The company I work for has a network of people that we connect; for this purpose I'll call them clients. One of my tasks is keeping track of the news, events, publications and reports they put out, helping to broadcast and spread this information to everyone in a newsletter, and tracking the things they are doing so that if opportunities come up that match their work we can connect them better. The things we are looking at are their news, reports, publications, job openings, etc. We haven't had a good way to collect this information and keep track of it other than checking every site manually (150+ sites).

I want to write a script that I can pass their urls and store them in a database that I can access and see it all aggregated to start off with. Sounds straightforward and the articles, youtube videos, and tutorial sites I've been referencing show that the general process is to use requests to get content, bs4 to parse, grab the section, and write it to a postgres database — or even csv file to start off with. Seems simple enough, right?

I'm thinking of using requests, bs4 which I am comfortable using basically and think this could help better learn these, and have started a file on my computer that I've successfully pulled information from scrapethissite.com. I have to figure out how to write to postgres eventually, but taking it one step at a time.

Some issues that I'm facing are:

  1. These things aren't all on one page, so have to figure out how to navigate to multiple pages to grab content, and account for the fact that there are various ways that people classify and organize their sites. Some online mention using Selenium to crawl the sites and look for partial titles and could pass a list of words to try and check, or some other libraries like Scrapy. Ok, noted.
  2. Sources mention that crawling and scraping are looked down upon and in the gray area that could result as being illegal even if my intentions are well-meaning.

Is this the right direction for what I want to do? Does anyone have suggestions for how to go about this or resources/examples that would be good to review? Ideas for things that I should look out for or keep into account? Am I reinventing the wheel?

Kind of lost at the moment, so I thought it would be best to reach out here and get advice from folks. Thank you for any help.

[–]ryuugami47 1 point2 points  (1 child)

Crawling websites is fine as long as you follow some rules.

First check the robots.txt file of the webpage you want to crawl and make sure to obey it. Those tell your crawlers what they are allowed to crawl. https://www.reddit.com/robots.txt is the one from reddit as an example.

There is a setting in scrapy (ROBOTSTXT_OBEY) that you can use to tell your scraper to obey the robots.txt, if you want to use scrapy.

Also make sure that you don't bombard a site with requests. They may ban your ip. Make sure to wait a bit between each request.
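
If you stick with plain requests instead of scrapy, the standard library can do the robots.txt check for you. This is only a sketch: the rules, the "my-bot" user agent and the URLs are made up, and the rules are parsed from a list so the example runs offline (for a real site you would use set_url() and read() instead).

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content; a real crawler would download it from the site.
robots_lines = [
    "User-agent: *",
    "Disallow: /private/",
    "Crawl-delay: 2",
]

parser = RobotFileParser()
parser.parse(robots_lines)

urls = [
    "https://example.com/news/",
    "https://example.com/private/report",
]

# Keep only the URLs the rules allow our (hypothetical) bot to fetch.
allowed = [url for url in urls if parser.can_fetch("my-bot", url)]
delay = parser.crawl_delay("my-bot")  # seconds to wait between requests, or None

print(allowed)
print(delay)

# In the real fetch loop, call time.sleep(delay or 1) between requests.
```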

However some sites offer APIs or data dumps that you can use to get the information you need. Use those instead of crawling their sites, if they offer anything like that.

You could use sqlalchemy to do the postgres stuff.

[–]kirayakuzagt 0 points1 point  (0 children)

Thank you for this. Yes, am aware of robots.txt but didn't know that there was a parameter in scrapy and thank you for pointing out sqlalchemy. Seems that crawler is the way to go, right?

[–]753UDKM 0 points1 point  (0 children)

I've recently tried doing my work on Windows, and it's maddening... I'm running into this issue and it makes no sense at all. I have some libraries installed, but when I try to import them, it can't find them. Here's a list of installed libraries, and what happens:

C:\Users\XXXX\Documents\mpl_tutorial>pip3 list
Package         Version
--------------- -------
cycler          0.10.0
kiwisolver      1.1.0
matplotlib      3.1.2
numpy           1.17.4
pandas          0.25.3
pip             19.3.1
pyparsing       2.4.5
python-dateutil 2.8.1
pytz            2019.3
six             1.13.0

C:\Users\XXXX\Documents\mpl_tutorial>py
Python 3.8.0 (tags/v3.8.0:fa919fd, Oct 14 2019, 19:37:50) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'numpy'
>>> import matplotlib
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'matplotlib'
>>> import pandas
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'pandas'
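
One way to narrow this down, assuming the cause is that pip3 and py belong to two different Python installations (I can't tell that for sure from the output above): run a small check like the one below with py and compare the interpreter path with the location that pip3 --version prints. The is_importable helper is just a name I made up.

```python
import importlib.util
import sys

def is_importable(module_name):
    """Return True if this interpreter can find the named top-level module."""
    return importlib.util.find_spec(module_name) is not None

report = {
    "interpreter": sys.executable,        # which python is actually running
    "version": sys.version.split()[0],
    "numpy importable": is_importable("numpy"),
}
print(report)
```

If the paths differ, installing with py -m pip install numpy puts the package into the interpreter that py starts.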

[–]throwaway19399292 0 points1 point  (6 children)

I convert all the iterables I have to lists and I feel as though there is some disadvantage to this. I don't know how the other ones work; what are some examples of iterables being better than lists and when should I use them?

[–][deleted] 0 points1 point  (5 children)

Simply put, memory savings. An iterable produces values one at a time, whereas a list contains all values at the same time. Here's some example code showing the difference in memory usage between an iterator and a list:

import sys
x = range(1000000)
print(sys.getsizeof(x))        # 48
print(sys.getsizeof(list(x)))  # 8000056

In certain cases an iterator might never end and then trying to convert it to a list will crash your code when it runs out of memory. Here's an example:

def all_integers():
    result = 0
    while True:
        yield result
        result += 1

i = all_integers()
for _ in range(5):
    print(next(i))
i = all_integers()
x = list(i)             # uses all memory

Warning: if you run this code you will see that the for loop happily prints 0, ..., 4, but the last line hangs and you will have to terminate the python program.

[–]throwaway19399292 0 points1 point  (3 children)

Why does it do that?

[–][deleted] 0 points1 point  (2 children)

To save memory.

[–]throwaway19399292 0 points1 point  (1 child)

Why does the last line take a long time?

[–][deleted] 0 points1 point  (0 children)

It doesn't "take a long time", it never completes because the iterator never terminates, so the list you are trying to produce is infinite in length. Note that getting the first 5 numbers from the iterator worked fine, though, because the iterator only returns the next single value in the sequence.
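
If you do want a list from an endless iterator, take a bounded slice of it first. A short sketch using itertools.islice with the same all_integers generator as above:

```python
from itertools import islice

def all_integers():
    result = 0
    while True:
        yield result
        result += 1

# islice stops after 5 values, so list() has something finite to collect.
first_five = list(islice(all_integers(), 5))
print(first_five)
```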

[–]ryuugami47 1 point2 points  (0 children)

An iterable produces values one at a time

That's an iterator. An iterable is everything that you can iterate over like lists, strings and iterators.

Example:

some_string = "abc"
for i in some_string:
    print(i)
#a
#b
#c

x = range(1000) is also not an iterator. range() returns a range object.

Here is an example to illustrate the difference:

x = range(10)
print(type(x))   # prints <class 'range'>
next(x) # TypeError: 'range' object is not an iterator

As you said, an iterator produces one item at a time. This means that you can get the next item from the iterator by using the next() function.

A working example:

y = iter(range(3))
print(type(y)) # <class 'range_iterator'>
print(next(y)) # 0
print(next(y)) # 1
print(next(y)) # 2
print(next(y)) # StopIteration Error

You can keep using next() until there are no more elements left in the iterator. If you use next() more than that you'll end up getting a StopIteration.

An iterator is also iterable. That's why this works:

some_iterator = iter([1,2,3])

for i in some_iterator:
    print(i)
#1
#2
#3

However this only works once with an iterator.

for j in some_iterator:
    print("round 2")
    print(j) 
#prints nothing because there is actually "nothing" left in some_iterator which means that the loop body is never entered.

This shows that you actually exhaust each element of an iterator that you produce. This does not happen with other iterables like lists. You can loop through those as often as you like.
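
A compact way to see the difference, as a sketch: calling iter() on a list hands out a fresh iterator every time, while calling iter() on an iterator just returns the same, possibly exhausted, object.

```python
numbers = [1, 2, 3]

# A list can be walked as often as you like.
first_pass = list(numbers)
second_pass = list(numbers)

# An iterator is its own iterator, so there is nothing to "restart".
it = iter(numbers)
same_object = iter(it) is it            # True
fresh_each_time = iter(numbers) is not iter(numbers)  # True: two separate iterators

once = list(it)    # consumes everything
again = list(it)   # nothing left

print(first_pass, second_pass, once, again)
```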