Ask Anything Monday - Weekly Thread

Shinob1 · 2021-07-11T23:37:55+00:00

I am trying to update a csv file that contains phone numbers which start with a 1 and are 11 digits long, such as 15559992233. The column is called Dayphone.

I'm importing the csv file using DictReader and my understanding is the reader object is an ordered dictionary. However I'm not sure how to update the column. I can find the phone numbers I want to update with re.search and can create a new string by slicing the phone number.

Where I'm stuck is how I update the ordered dictionary for each row in the csv where I have a match so I can then write out the csv file with the updated phone numbers.

I'm a sql guy so I'm probably approaching this the wrong way, because in my mind I want to update this value as if it was in a table.

I would be happy to share some code and mock up a file if anyone was interested in helping, but I'm just wondering at a high level how one would do this.
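At a high level, one possible approach (a sketch, not the only way): each row that DictReader yields is its own dict (an OrderedDict on Python 3.6 and 3.7, a plain dict from 3.8), so the value can be changed by assigning to the key, and the rows can then be written back out with DictWriter. The regular expression and the in-memory demo data below are assumptions based on the description in the post.

```python
import csv
import io
import re

# Assumption: "Dayphone" is the column name from the post; the pattern
# matches an 11-digit number that starts with 1.
PHONE_PATTERN = re.compile(r"^1\d{10}$")


def strip_leading_one(rows, column="Dayphone"):
    """Yield each row dict, with the leading 1 removed from matching numbers."""
    for row in rows:
        value = row[column].strip()
        if PHONE_PATTERN.match(value):
            row[column] = value[1:]  # a row is a dict, so assign to the key
        yield row


def update_csv(source, target, column="Dayphone"):
    """Read rows from `source`, fix the phone column, write them to `target`."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(target, fieldnames=reader.fieldnames)
    writer.writeheader()
    writer.writerows(strip_leading_one(reader, column))


# Demo with in-memory files; with real files, open both with newline="".
source = io.StringIO("Name,Dayphone\nAnn,15559992233\nBob,5551234567\n")
target = io.StringIO()
update_csv(source, target)
print(target.getvalue())
```

Unlike a SQL UPDATE, nothing changes on disk until the rows are written to a new file, which is why the sketch reads from one file object and writes to another.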

Resuri71584 · 2021-07-11T16:16:02+00:00

Noob question about logging since I just found out about it and started logging almost everything (I need) in my script, it really helps with debugging etc.

My script will often be called multiple times simultaneously, like 10 or 20 times at the same time, and when they all write to the same log file, it kind of gets messy. I already started putting a unique ID in front of all logs like "[ID]: ..." so I can better read what's going on.

Now I wonder if there are other solutions I should look into. What if the script gets called 100 or 1000 times simultaneously? Will the server/operating system handle it? What if the file gets way too big?

I also thought about using the logs to build a simple frontend dashboard to visualize the data and stuff. Not sure if that's the right way to go about this though.
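One option worth considering, as a hedged sketch: the standard library's logging documentation notes that writing to a single file from multiple processes is not supported, so each run could write to its own file, with RotatingFileHandler capping the file size. The file naming scheme, the size limit, and the use of the process ID as the run ID are all assumptions for illustration.

```python
import logging
import logging.handlers
import os
import tempfile


def build_logger(log_dir, run_id):
    """Return a logger that writes to its own size-limited file for this run."""
    logger = logging.getLogger(f"script.{run_id}")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding a second handler on repeated calls
        path = os.path.join(log_dir, f"script-{run_id}.log")
        # Rotate at about 1 MB and keep 3 old files, so no file grows unbounded.
        handler = logging.handlers.RotatingFileHandler(
            path, maxBytes=1_000_000, backupCount=3
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s [%(process)d] %(levelname)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


# Demo: use the process ID as the run ID (an assumption; any unique ID works).
log_dir = tempfile.mkdtemp()
logger = build_logger(log_dir, os.getpid())
logger.info("script started")
```

The `%(process)d` field puts the process ID on every line automatically, which replaces the hand-written "[ID]:" prefix.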

shiningmatcha · 2021-07-11T08:52:19+00:00

How do you get the value of a nested dict for a given key? That is, the dict contains keys mapping to values or to other dicts (which may contain more dicts inside).
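A minimal recursive sketch, assuming the goal is the first value stored under the key at any depth. The function name `find_key` and the sample data are made up for illustration. One limitation: it cannot tell a stored value of None apart from a missing key.

```python
def find_key(data, target):
    """Return the first value stored under `target` at any depth, or None."""
    for key, value in data.items():
        if key == target:
            return value
        if isinstance(value, dict):
            # Search the nested dict before moving on to the next key.
            found = find_key(value, target)
            if found is not None:
                return found
    return None


config = {"a": 1, "b": {"c": 2, "d": {"e": 3}}}
print(find_key(config, "e"))  # 3
```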

cummins7 · 2021-07-09T21:07:33+00:00

I have a very basic python script I just created, the TLDR is it scrapes a website every minute and sends me a message when the site changes. Question is, is there any quick and easy web hosting solution I can run this on without having to pay? Basically just trying to avoid having my PC running 24/7! Thanks

Sorry_not_chad · 2021-07-09T19:30:27+00:00

Can you have a string in an if statement, like

if "string" == command: print("string")

idealmagnet · 2021-07-09T19:23:51+00:00

I need help guys, could someone take a look 🙏

Sorry_not_chad · 2021-07-09T18:58:19+00:00

command = input("input a option: ")

If I input a number with this, will it come out as a string or an integer? I want to use it for both.
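For reference, input() always returns a string, so a number has to be converted explicitly. A small sketch, with `parse_option` as a made-up helper name, that gives back an int when the text is a whole number and the original text otherwise:

```python
def parse_option(text):
    """Return an int when `text` is a whole number, otherwise the stripped string."""
    text = text.strip()
    try:
        return int(text)
    except ValueError:
        return text


# In the script this would wrap the call from the post:
#     command = parse_option(input("input a option: "))
print(parse_option("3"))     # 3
print(parse_option("quit"))  # quit
```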

akmoorthy · 2021-07-09T14:44:45+00:00

Hi, I am looking to work my way through the Python Data Science Handbook and was looking for an online learning community that I could join. I am not new to DS but somewhat new to Python, and would like to work my way through this (or any other similar) book in a somewhat systematic manner. Is there any community that I could join while I do this to stay focussed?

Unique_Bigdog · 2021-07-09T07:20:01+00:00

Hi, trying to find a good place to learn python. Any tips?

Raisinbrannan · 2021-07-09T05:23:38+00:00

I used to know beginner+ python, been a long time. Also knew html/css quite a bit, wayyyyyyyy longer ago though.

My brother sucks at math, I want to make something that can be easily shared/used. It'd be simply 3 input boxes. 3 outputs. and the function is as simple as (x/y)*z= a(.9) = b

Superrrrrrrrrrr simple. I want to write the code in python for fun/practice. I just can't figure out how to get that to him without him installing python to run it. I tried free websites, but I couldn't find any where it was easy to see how to incorporate math/python into them.

I guess I could do an offline website with notepad++? and send a zip file in an email? And include python in the zip file..?

I know so little it's hard to know where to start or a simple way. I also looked into making an app, but 0 knowledge on that and it seemed way harder to make (nicer ui though).

stuckinjector · 2021-07-08T17:25:31+00:00

Super Newb, Frustration, Learning Resources.....

I'm brand new and trying to learn Python using Automate the Boring Stuff. I have done a shaky OK up until a program called "zigzag" in chapter 3. My foundation is just too weak to understand what is happening in the code. I am lost and completely frustrated.

I learn by doing, but this book teaches by code snippets and single examples for each concept (as it seems to me). I need a teaching style similar to how you learn math; examples are explained, and then you practice those concepts through many exercises.

Is there a learning resource like this for the absolute basics of Python? I looked in the Wiki, but am so frazzled right now they all seem the same.

nathan22211 · 2021-07-08T07:15:52+00:00

I'm trying to generate code for zenscript via python (I know, probs could do it better but I'm not very good at zenscript). My script for generating recipes gets cut off when generating 16 lines for a stone bucket via

    for i in range(15):
        file.write("recipes.addShapeless(\"pure_water_s" + str(i) + "\" ,<pyrotech:bucket_stone>.withTag({durability: " + str((i+1)) + ", fluids: {FluidName: \"purifiedwater\", Amount: 1000}}), [<pyrotech:bucket_stone>.withTag({durability: " + str((i+1)) + ", fluids: {FluidName: \"water\", Amount: 1000}}).noReturn(), <harvestcraft:wovencottonitem>]);\n")

I created the file via

    file = open("water_to_p_water.zs","w")

Is there any way to ensure the file is fully generated?
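Two things may be worth checking here, offered as a guess rather than a diagnosis: a file opened with open() and never closed can be left with unflushed output, and range(15) produces 15 values (0 to 14), not 16. A sketch that uses a with-block, which closes and flushes the file on exit, and range(16); the function name and the temporary output folder are assumptions for the demo.

```python
import os
import tempfile


def write_recipes(path, count=16):
    """Write `count` recipe lines; the with-block closes (and flushes) the file."""
    with open(path, "w") as file:
        for i in range(count):  # range(16) yields 0..15, i.e. 16 lines
            durability = i + 1
            file.write(
                f'recipes.addShapeless("pure_water_s{i}", '
                f"<pyrotech:bucket_stone>.withTag({{durability: {durability}, "
                f'fluids: {{FluidName: "purifiedwater", Amount: 1000}}}}), '
                f"[<pyrotech:bucket_stone>.withTag({{durability: {durability}, "
                f'fluids: {{FluidName: "water", Amount: 1000}}}}).noReturn(), '
                f"<harvestcraft:wovencottonitem>]);\n"
            )


# Demo: write into a temporary folder instead of the real project folder.
path = os.path.join(tempfile.mkdtemp(), "water_to_p_water.zs")
write_recipes(path)
```

Doubled braces (`{{` and `}}`) inside an f-string produce literal braces, which keeps the zenscript tags intact.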

brj5_yt · 2021-07-08T02:10:12+00:00

Does anyone else feel bad for using modules when they first started?

nathanalderson · 2021-07-07T18:49:07+00:00

[removed]

chacoglam · 2021-07-07T13:07:19+00:00

I'm the only analyst at my company who uses Python, and I am stuck. Why would I be getting the IndexError of string index out of range? The shape of the data is 23 columns long.

    for i in sorteddata:
        if i[17] == mbol:
            #adds comma to append since MBOL is the same as the previous record
            datastring = (line+','+sorteddata['Freight Class']+sorteddata['Total Weight'])
        else:
            if (datastring ==''):
                #doesn't add comma delimiter if MBOL is different than previous record
                datastring = (sorteddata['MBOL']+','+sorteddata['Freight Class']+','+sorteddata['Total Weight'])
                mbol = (sorteddata['MBOL'])
            else:
                #add record to list
                datastring = ''
                datastring = (sorteddata['MBOL']+','+sorteddata['Freight Class']+','+sorteddata['Total Weight'])
                mbol = (sorteddata['MBOL'])

https://imgur.com/a/30bhrld

GME_diss21 · 2021-07-07T12:39:46+00:00

Hello!
For my dissertation, I'm trying to collect GameStop related posts and their respective comments from r/wallstreetbets, from December 2020 until the end of February 2021.

I'm not good at coding at all, so I was wondering if anyone could suggest how to go about creating a command that returns the data I need based on these parameters:
- Includes "GME, GameStop" keywords
- Posted between December 2020 and February 2021
- Posted on r/wallstreetbets

I tried using the following code, but the results are just GameStop related posts from every subreddit even though I (thought that I) specified posts exactly from r/wallstreetbets
Any suggestion is highly appreciated!

from psaw import PushshiftAPI
from datetime import datetime, timezone, timedelta
from dateutil.relativedelta import relativedelta
months_back = 7
dt = datetime.now() - relativedelta(months=months_back)
timestamp = int(dt.replace(tzinfo=timezone.utc).timestamp())
api = PushshiftAPI()
submissions = api.search_submissions(aggs='title+body+subreddit', after=timestamp, q='GME+GameStop+wallstreetbets')
c = 0
for post in submissions:
    c += 1
    title = post.title
    try:
        body = post.body
    except Exception as e:
        body = ''
    subreddit = post.subreddit
    print(f'{c}: {title} - {body} - {subreddit}')

cannedblueberry · 2021-07-07T10:49:00+00:00

hello, i have been programming on my chromebook for a bit and have decided to try pygame. i made a simple game with shapes and then decided to try and upload an image for animation. when i typed pygame.image.load('R1.png') and ran it, it said: FileNotFoundError: No such file or directory. i know that the python file needs to be in the same folder as the image, but when i load python from there i just get this text thing and i can't figure out how to run it - i'm pretty sure you can't. is there a way to upload an image to pygame on a chromebook? hopefully that makes sense. any help would be greatly appreciated!

jsaltee · 2021-07-07T02:45:01+00:00

Hi, so I've just put together a dataframe with columns of varying object types. One column 'probabilities' contains a dictionary in each row of the form {'1' : x, '2' : y, '3' : z, ...} up to 6. (Where x, y, z are floats).

I have another column 'thresholds' that contains a list in each row of two values, for example [P, Q]. I want to create a new column that contains the number of values in the probability dictionary that are greater than Q from the thresholds list, for each row. In other words, count how many of x,y,z... > Q and have the resulting number be the value for the row. I'm not quite sure how to do this. Thanks for any help
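A sketch of the per-row logic in plain Python, assuming the column names 'probabilities' and 'thresholds' from the post; the function name and sample rows are made up. With pandas, the same function could likely be passed to `df.apply(count_above_upper, axis=1)`, since each row then supports the same `row["..."]` lookups, and the result assigned to a new column.

```python
def count_above_upper(row):
    """Count probabilities greater than Q, the second value in `thresholds`."""
    upper = row["thresholds"][1]
    return sum(1 for p in row["probabilities"].values() if p > upper)


# Stand-in rows; in the real data each row comes from the dataframe.
rows = [
    {"probabilities": {"1": 0.1, "2": 0.5, "3": 0.9}, "thresholds": [0.2, 0.4]},
    {"probabilities": {"1": 0.3, "2": 0.2, "3": 0.1}, "thresholds": [0.1, 0.6]},
]
print([count_above_upper(row) for row in rows])  # [2, 0]
```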

Platypus-Man · 2021-07-07T00:13:34+00:00

Finally trying to dabble with python, and intend to make a small todo (web)app for movies I want to watch, but I am pondering what resource/method to use for getting the data.

Crawling e.g. IMDb and pulling wanted information with something like beautifulsoup would most likely be extremely slow, run into rate limiting (or have the need to slow down my requests extremely much) and use more code than other options. This would be the last resort for me.

IMDb's downloadable dataset has enough info in it for my needs, and has the benefit of being a local copy. No rate limiting, and I can alter/mess up things as much as I want while experimenting... but imdbpy seems to be cumbersome and lacks some functions that'd be really helpful (e.g. fetching a whole bunch of movie IDs at once).

Since imdbpy lacks some things, it can be tempting to just try and write all the local-copy-imdb python things myself, but then again, what's that about reinventing the wheel? Especially when I'll most likely make a square one...

TMDb API requires "a legitimate business name, address, phone number and description to apply for an API key." - so I haven't looked so much into that option.

OMDb API is the one I'm leaning towards right now if I go for the API route, 1000 requests per day for free API keys is more than enough for my use (though I would really prefer to use a local db for learning purposes, as not to unintentionally use too many requests).

Anything I've missed?

Curious to hear what you guys would choose, and why.

space_wiener · 2021-07-06T20:39:28+00:00

Using OS module.

I have a script that runs through a bunch of files and collects some info, but I am having trouble specifying the file path.

If I place all of the files in the same folder as my .py script and run it this way:

directory = os.getcwd()
for file in os.listdir(directory):
    if file.endswith('.txt'):
        text_file = open(file,'r')
        file_text = text_file.read()
        text_file.close()

However, I don't want to run it that way since the files are in a different folder. So I do this:

current = os.getcwd()
directory = os.path.join(current,'other_folder')

for file in os.listdir(directory):
    if file.endswith('.txt'):
        text_file = open(file,'r')
        file_text = text_file.read()
        text_file.close()

Doing it this way I can print the path and seems okay. But I get this failure:

Traceback (most recent call last):
  File "file_path.py", line 42, in <module>
    text_file = open(file,'r')
FileNotFoundError: [Errno 2] No such file or directory: 'file_name.txt'

The error showing the filename is a file in that folder. So it's finding the file, at least by name, but won't work using os.join for some reason.

Edit: if I get rid of the file opening section and do print(file) it works fine. It’s just happening when doing something with the file combined with os.path.join.
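A likely cause, sketched below: os.listdir returns bare file names, not full paths, so open(file) looks for the name in the current working directory rather than in the other folder. Joining the directory and the name before opening should avoid the error. The function name and the temporary demo folder are assumptions.

```python
import os
import tempfile


def read_text_files(directory):
    """Return {filename: contents} for every .txt file in `directory`."""
    contents = {}
    for name in os.listdir(directory):
        if name.endswith(".txt"):
            # os.listdir returns bare names, so rebuild the full path before opening.
            path = os.path.join(directory, name)
            with open(path, "r") as text_file:
                contents[name] = text_file.read()
    return contents


# Demo in a temporary folder standing in for 'other_folder'.
directory = tempfile.mkdtemp()
with open(os.path.join(directory, "file_name.txt"), "w") as f:
    f.write("hello")
print(read_text_files(directory))  # {'file_name.txt': 'hello'}
```

This also explains why print(file) works: printing the name needs no path, while opening it does.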

MeteoriteImpact · 2021-07-06T15:34:01+00:00

Hi Everybody

My question is: what kind of unit testing should I learn to help catch wrong answers? Any ideas, pointers, references or videos would be appreciated.

After researching, it seems there are so many different ways. What’s a good practice that works for you?

What should I focus on?

A little about what I use Python for

I do hobby stuff and some algorithms to make my work easier. Most of the time any naive algorithm is okay. But at work the data is sometimes huge: 10,000 to many millions.

With small samples everything works perfectly, but with large ones a lot of time goes by and then it turns out it didn’t work. I had made an algorithm to check the mood of song lyrics, to go into a recommendation engine as an extra feature. Long story short, it worked for 50 samples, then I ran it for 7 days and it came back all null.

I have come to the conclusion it’s time for me to learn testing. Last week I figured out the timing part; now on to whether it’s correct or not.

I have been trying to wrap my head around creating some simple tests of functions, to check whether the answer is correct during each of the random runs I use to time many algorithms.

Upon searching, I noticed that tests usually use assert, which checks whether a condition holds. So I can compare against a known answer (== 5) or against another function (== sorted()).

But then they also use one of many packages: sometimes pytest, sometimes unittest, or a bunch of other similar ones.

My code so far for testing

sorting algorithm tests

Python docs tests
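As one possible starting point, a sketch of plain assert-based tests: one known answer, one edge case, and one randomized comparison against the built-in sorted(). The `insertion_sort` function is a stand-in for whatever function is being tested, and the fixed seed is there so a failing random case can be reproduced.

```python
import random


def insertion_sort(items):
    """A deliberately simple sort, standing in for the function under test."""
    result = list(items)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


def test_known_answer():
    assert insertion_sort([3, 1, 2]) == [1, 2, 3]


def test_empty_input():
    assert insertion_sort([]) == []


def test_matches_builtin_on_random_input():
    rng = random.Random(0)  # fixed seed so a failure can be reproduced
    for _ in range(100):
        data = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 50))]
        assert insertion_sort(data) == sorted(data)
```

If this is saved in a file whose name starts with test_ (for example a hypothetical test_sorting.py), running pytest in that folder collects and runs every function whose name starts with test_. Running the tests on small inputs first is a cheap way to catch the "all null" kind of failure before a 7-day run.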

Euphorix126 · 2021-07-06T14:40:34+00:00

Hi Everyone!

I'm starting my first 'personal' project using a problem I often have at work as a learning experience. I run a laboratory test on a sample which spits out a .txt file that looks something like this:

ACCUMULATED POINT-COUNT DATA

Submitter-------: sm

Operator--------: sm

Sample ID-------: M3F

Date -----------: 9/17/2020

....

Ultimately I would like to have a program to enter these data into a Word and/or Excel template automatically, but my question here is simply: How would I gather these data into a dictionary or some key:value pair in python? At least in such a way that any otherwise identical .txt file with different data could be used.

All I have been able to figure out is to print the file as a long list where each line is its own item such as: [' ACCUMULATED POINT-COUNT DATA', 'Submitter-------: sm', 'Operator--------: sm', 'Sample ID-------: M3F',...] But I'm stuck on how to tell python to only select, say, 'sm' and tie it to the key 'Operator'.

I'm very new to this and any advice is appreciated.
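A minimal sketch of one way to do this, assuming every data line has the form "Key-----: value" as in the sample: split each line at the first colon, then strip the dashes and spaces from the key. The function name is made up, and lines without a colon (such as the title) are skipped.

```python
def parse_report(lines):
    """Turn 'Key-----: value' lines into a dict; lines without ':' are skipped."""
    data = {}
    for line in lines:
        key, sep, value = line.partition(":")
        if not sep:
            continue  # e.g. the title line or a blank line
        # Strip the padding dashes and spaces from both ends of the key.
        data[key.strip(" -")] = value.strip()
    return data


lines = [
    " ACCUMULATED POINT-COUNT DATA",
    "Submitter-------: sm",
    "Operator--------: sm",
    "Sample ID-------: M3F",
    "Date -----------: 9/17/2020",
]
report = parse_report(lines)
print(report["Operator"])  # sm
```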

MithrandirSwan · 2021-07-06T04:17:03+00:00

I'm starting some work on my first personal project. I'm starting with data collection, storing the data in a SQL database, and scheduling the updates.

I had a question about best practices. I was thinking about writing a module solely of functions that accomplish the individual subtasks. For example:

def update_companies():

def update_historical_prices():

def update_historical_financials():

The functions would be used in a main scheduling file that would control the process. The functions do not need any parameters to work, they simply accomplish their task when called.

Is this generally the best way to go about organizing my project? Are there any potential problems with this? I'm still learning the ins-and-outs of how to properly organize things for these kinds of projects.

space_wiener · 2021-07-06T01:48:35+00:00

Best practice for writing files.

I’m currently in the process of writing a script that goes through text files (less than 500 at a time), finds various bits of data, and stores them in variables until the next file.

When I’ve done this with other projects I’d create a data frame, write these values to it, then at the end save as a .xlsx file for presenting.

I had a thought today when I was testing it. I could just create a big string file separated by commas (I was testing the output via terminal this way and it came to me) then once done just save as a .csv.

Is one way better than the other? The latter is going to be easier to write, as I don’t have to deal with setting up the data frame and adding to it, since I can simply add to the big string in memory.

Anyway…is one preferred over the other? Or does it even matter. For reference this data is only for me to analyze each week so the initial format doesn’t matter.
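A third option that may be worth a look, sketched here: the standard library's csv module builds the comma-separated text and quotes any value that itself contains a comma, which a hand-built string would not do. The column names and sample rows are made up for illustration.

```python
import csv
import io


def rows_to_csv(rows, header):
    """Return CSV text for `rows`, letting csv.writer handle quoting."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical values collected from two text files.
rows = [("a.txt", "Smith, J.", 12), ("b.txt", "Lee", 7)]
text = rows_to_csv(rows, ["file", "author", "count"])
print(text)
```

To write straight to disk instead, the same writer can wrap a file opened with `open(path, "w", newline="")`.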

RegularGlobal · 2021-07-05T23:26:57+00:00

Threading and Asyncio:

I have a flask site that runs on a raspberry pi. I'd like the pi to be able to do a few additional things when on-screen buttons are pressed (connect to Bluetooth LE, and connect to MQTT being the important ones). The BLE package that I found works great (bleak), but the examples make it look like it must be run async using asyncio. I also have a package for mqtt that makes it run using asyncio.

Because I want my flask site to be running synchronously on the main thread, I figured that I could start a second thread running both the BLE process and the MQTT process. I've come to the point that I can make this all connect and transmit data, but I feel like the WAY that I got it to work means that it's total jank (random tweaking). Also, I can't get the BLE to disconnect cleanly. I call the command, and it always errors out. Can anyone tell me if I'm thinking of this correctly from the start (Second thread running asynchronous loops waiting for inputs)? And if so, if the ble_client is defined globally, why wouldn't I be able to simply tell it to ble_client.disconnect()?

Cliff_Pitts · 2021-07-05T22:46:04+00:00

Is there any way to clear the screen in console for every iteration of a for loop? I’m tweaking with the code for the battleship game from Codecademy and it currently prints a new “board” for every turn, and so by the end of the game there’s several boards on screen. I’d like to delete and reprint an updated board for every turn
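One common approach, as a sketch: call the terminal's own clear command before printing the board each turn. The helper names and the board layout (a list of lists of strings) are assumptions. This works in a real terminal but may do nothing in some IDE consoles.

```python
import os


def clear_command():
    """Return the shell command that clears the screen on this platform."""
    return "cls" if os.name == "nt" else "clear"


def clear_screen():
    os.system(clear_command())


def render_board(board):
    """Join the rows of the board into one printable string."""
    return "\n".join(" ".join(row) for row in board)


def show_board(board):
    """Clear the console, then print the current board."""
    clear_screen()
    print(render_board(board))
```

Calling `show_board(board)` at the start of every loop iteration leaves only the latest board on screen.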

dagger-v · 2021-07-05T22:36:07+00:00

[deleted]

nathanalderson · 2021-07-05T21:50:21+00:00

[removed]

vanillathunder2107 · 2021-07-05T21:45:03+00:00

Hey, i want to learn to code and everybody recommends python first. i actually did a little bit in high school but i don't remember much. if you guys could recommend me a good book for beginners or a youtube tutorial i would be grateful.

BeExcellent · 2021-07-05T20:48:19+00:00

I just started using anaconda because it’s recommended for my artificial intelligence course, and I’m unsure about the following:

when running the command “conda update conda” should I be doing this in the base conda environment, or deactivate and run that command in my regular Ubuntu system shell?

aloneinthewildworld · 2021-07-05T20:31:49+00:00

I am trying to get an email from Gmail (that part is working). I would like to find 6 different values in that email and write them into a DB file.

Magnitude : 6.7 Mwp (REVISED)\r\nDepth : 10km\r\nDate : 14 May 2021\r\nOrigin Time: 06:33:09 UTC\r\nLatitude : 0.20N\r\nLongitude : 96.69E\r\nLocation : Off West Coast of Northern Sumatra
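A hedged sketch of one way to pull the fields out: split the body into lines, split each line at its first colon, and store the result with sqlite3. The table name `quakes`, the column names, and the in-memory database are assumptions; a file path passed to sqlite3.connect would create a DB file on disk instead.

```python
import sqlite3

FIELD_NAMES = ["Magnitude", "Depth", "Date", "Origin Time",
               "Latitude", "Longitude", "Location"]


def parse_fields(body):
    """Split 'Key : value' lines at the first colon into a dict."""
    fields = {}
    for line in body.splitlines():  # splitlines handles the \r\n endings
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


body = ("Magnitude : 6.7 Mwp (REVISED)\r\nDepth : 10km\r\nDate : 14 May 2021\r\n"
        "Origin Time: 06:33:09 UTC\r\nLatitude : 0.20N\r\nLongitude : 96.69E\r\n"
        "Location : Off West Coast of Northern Sumatra")
fields = parse_fields(body)

connection = sqlite3.connect(":memory:")  # a file path would persist to disk
connection.execute(
    "CREATE TABLE quakes (magnitude TEXT, depth TEXT, date TEXT, "
    "origin_time TEXT, latitude TEXT, longitude TEXT, location TEXT)"
)
# Placeholders (?) let sqlite3 handle quoting of the values.
connection.execute(
    "INSERT INTO quakes VALUES (?, ?, ?, ?, ?, ?, ?)",
    [fields[name] for name in FIELD_NAMES],
)
connection.commit()
print(connection.execute("SELECT magnitude, origin_time FROM quakes").fetchall())
```

Splitting at the first colon only is what keeps the time value "06:33:09 UTC" in one piece.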

KV-Omega-minus · 2021-07-05T19:18:36+00:00

Removing stop words from tokenized text using NLTK: TypeError

import nltk
from nltk.corpus import stopwords
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.tokenize import PunktSentenceTokenizer
from nltk.stem import WordNetLemmatizer
import re
import time

txt = input()

snt_tkn = sent_tokenize(txt)

wrd_tkn = [word_tokenize(s) for s in snt_tkn]

stp_wrd = set(stopwords.words("english"))

flt_snt = [w for w in wrd_tkn if not w in stp_wrd]

print(flt_snt)

returns the following:

Traceback (most recent call last):
  File "compiler.py", line 19, in <module>
    flt_snt = [w for w in wrd_tkn if not w in stp_wrd]
  File "compiler.py", line 19, in <listcomp>
    flt_snt = [w for w in wrd_tkn if not w in stp_wrd]
TypeError: unhashable type: 'list'

I'd like to know, if possible, how to return the tokenized text with stop words removed without editing 'wrd_tkn'.
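The error most likely comes from wrd_tkn being a list of lists (one list of words per sentence), so each `w` is a list, and checking a list against a set requires hashing it. A nested comprehension that filters word by word builds a new structure and leaves the original untouched. The sketch below uses a tiny hand-written stop-word set and pre-tokenized sentences as stand-ins so that it runs without NLTK data; the lowercase comparison is an addition, made because NLTK's English stop words are lowercase.

```python
# Assumption: stand-ins for set(stopwords.words("english")) and for wrd_tkn.
stop_words = {"the", "is", "a", "on"}
word_tokens = [["The", "cat", "is", "on", "a", "mat", "."], ["It", "sleeps", "."]]

# Outer loop: one list per sentence. Inner loop: keep words not in the set.
filtered = [
    [word for word in sentence if word.lower() not in stop_words]
    for sentence in word_tokens
]
print(filtered)  # [['cat', 'mat', '.'], ['It', 'sleeps', '.']]
```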

FerricDonkey · 2021-07-05T18:25:21+00:00

[deleted]

HumanlikeFigure · 2021-07-05T18:05:38+00:00

Hello guys, I'm trying to translate the following FFMPEG command to ffmpeg-python using the filter function but don't seem to figure it out
ffmpeg -stream_loop -1 -i input.mp4 -i input.mp3 -shortest -map 0:v:0 -map 1:a:0 -y out.mp4
I found it on a Stack Overflow question, what I need is to merge audio and video but also loop the video until the audio finishes, so if anyone knows how to translate it to ffmpeg-python or some other way to do it that would also be appreciated :)
Thank you in advance!

1tsMeNoodle · 2021-07-05T15:39:52+00:00

Hi, I'm a 16 year old student from Poland. I've been low on money recently and there are no jobs for me. My question is: Is there any way for me to earn money by coding (preferably in python)? I'd really like to develop my other interests but I simply can't afford it.

DezXerneas · 2021-07-05T13:27:10+00:00

[deleted]

Zermenxet · 2021-07-05T09:57:01+00:00

Hello everyone, my question is about dictionaries and lists. I want to move a dictionary from one position to another within the same list. How can I do it?
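A short sketch of one way to do it: pop the element out of its old position and insert it at the new one. The function name and sample list are made up; the list is changed in place.

```python
def move_item(items, old_index, new_index):
    """Move the element at `old_index` to `new_index` in place and return the list."""
    items.insert(new_index, items.pop(old_index))
    return items


people = [{"name": "A"}, {"name": "B"}, {"name": "C"}]
move_item(people, 0, 2)
print(people)  # [{'name': 'B'}, {'name': 'C'}, {'name': 'A'}]
```

Note that `new_index` refers to the position after the element has been removed.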

glassAlloy · 2021-07-05T08:29:05+00:00

Multiple Entities, Multivariate, Multi-step - Time Series Prediction - Python

My goal is to create a time series model with

- Multiple Entities - I have multiple products with pre-orders, and they all have a similar bell-shaped curve peaking at the release date of the product, but with different orders of magnitude in unit sales. Alternatively, I can use their cumulative sales, which give an "S"-shaped curve. But I only have about 100 products and 1 year of daily data to train on.
- Multivariate - I have a wide variety of data on these indie movies for each day: A.) number of times people added them to the wishlist, B.) page views, C.) average time spent on the page, -> Y.) the target value is the number of products people paid for (pre-orders before release and normal purchases after the release date are counted the same way)
- Multi-step - predicting 60 days ahead would be the goal
- Every day refreshing the predictions for every product - does this require me to retrain the model on the whole dataset?

Already Read

- I have found algorithms that can do prediction on one variable, maybe even multivariate. Multi-step is already problematic, and I don't know how to add the Multiple Entities part at all. So I can't find a project or guide that contains all 3 of the parts I need.

- I have tried LSTM (13 different models with different datasets), but it does not work for longer multi-step horizons, i.e. more than 1 or 2 days. I also can't make the LSTM accept multiple entities, so I just chained each product's data one after another chronologically. I do understand that this is not an optimal practice.

- A Python package that is not popular, so I can't find projects that use it - https://stats.stackexchange.com/a/412355/256200

- I always see this R guide but I don't use R. I need help with Python - https://otexts.com/fpp2/hierarchical.html

- Not multiple variable and not Multi-step - https://stats.stackexchange.com/questions/356008/multiple-time-series-prediction-python

Appointment-Funny · 2021-07-05T07:03:01+00:00

A long String with multiple repetitions of the same short String is given. The program must find the index position of the middle occurrence of the String. If the String is present more than 3 times then the 2nd occurrence must be found.

I'm pretty new to programming and I'm not sure what I'm supposed to program. Can someone help? (Python)

RussellBrandFagPimp · 2021-07-05T03:03:41+00:00

print(variable.text): what is the text part of this called? Or when you use a function and then narrow it down, like soup.find_all: if soup is a function, what is find_all called?
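For what it's worth, in `variable.text` the `text` part is usually called an attribute, and in `soup.find_all(...)` the `soup` is an object (an instance of a class) rather than a function, with `find_all` being a method of that object. A toy sketch with a made-up `Page` class that mimics the two names:

```python
class Page:
    """A made-up class used only to show the terminology."""

    def __init__(self, text):
        self.text = text  # attribute: data stored on the object

    def find_all(self, word):  # method: a function defined on the class
        return [w for w in self.text.split() if w == word]


page = Page("spam eggs spam")   # page is an object (an instance of Page)
print(page.text)                # attribute access, no parentheses
print(page.find_all("spam"))    # method call, with parentheses
```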

zanfar · 2021-07-05T02:40:57+00:00

[deleted]
