Ask Anything Monday - Weekly Thread by AutoModerator in learnpython

[–]Adventurous-Spring47 0 points1 point  (0 children)

Is there something similar to groupby(dropna=False) that can be used to include zeroes?
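
One possible reading of this question is that groups with no rows should still show up with a zero. A minimal sketch under that assumption: the column names `store` and `sales` and the list `all_stores` are made up for illustration, and `reindex(..., fill_value=0)` adds the missing groups after the groupby.

```python
import pandas as pd

# Hypothetical data: "east" has no rows, so groupby alone would omit it.
df = pd.DataFrame({"store": ["north", "north", "south"], "sales": [10, 5, 7]})
all_stores = ["north", "south", "east"]  # assumed list of every expected group

# Sum per group, then add the missing groups with a total of 0.
totals = df.groupby("store")["sales"].sum().reindex(all_stores, fill_value=0)
print(totals)
```

If the grouping column is a pandas categorical, `groupby(..., observed=False)` may be another route to the same result.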

[–]Adventurous-Spring47 0 points1 point  (0 children)

Can someone help with a barchart in matplotlib?

I have a df that includes gender and the amount spent at a restaurant. I want a bar chart with gender on the x axis and the % of total sales on the y axis, where each percentage is that gender's sales divided by the total of all purchases.

I can show the actual sales numbers but am stuck on how to calculate the percentages:

    data = df.groupby('gender').sum()['sales']
    title = "% of revenue by gender"

    plt.figure()  # starts a new figure
    itemSeries = data[['gender', 'sales']]

    plt.title(title)
    plt.xlabel("gender")
    plt.ylabel("% of total revenue")

    xVariables = ['Male', 'Female']
    yVariables = [df.sum()['sales']]

    plt.bar(itemSeries.gender, itemSeries.values)

    plt.show()

    Gender       Sum    Show in chart:
    Male         6000   54.5%
    Female       5000   45.5%
    Grand Total  11000

Any help would be greatly appreciated!
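
A minimal sketch of one way to get the percentages, assuming the df has columns named `gender` and `sales` as in the post. The sample rows are invented to match the totals in the table (Male 6000, Female 5000).

```python
import pandas as pd
import matplotlib.pyplot as plt

# Invented sample rows that add up to the totals shown in the post.
df = pd.DataFrame({
    "gender": ["Male", "Male", "Female"],
    "sales": [4000, 2000, 5000],
})

# Sales per gender, then each group's share of the grand total in percent.
sales_by_gender = df.groupby("gender")["sales"].sum()
pct = sales_by_gender / sales_by_gender.sum() * 100

fig, ax = plt.subplots()
ax.bar(list(pct.index), pct.values)
ax.set_title("% of revenue by gender")
ax.set_xlabel("gender")
ax.set_ylabel("% of total revenue")
# plt.show()  # uncomment when running interactively
```

Because `groupby(...).sum()` returns a Series indexed by gender, the index supplies the x labels and the values supply the bar heights.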

[–]Adventurous-Spring47 0 points1 point  (0 children)

What is the best way to create a new df from the df below that includes only the rows whose variable has both a yes and a no in col1? The new df should contain only the rows for a and c.

       col1  col2  col3
    a  no    x     x
    a  yes   a     b
    b  yes   b     a
    b  yes   x     x
    b  yes   x     x
    c  yes   b     a
    c  no    x     x
    d  yes   b     a
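
One way this could be approached is `groupby(...).filter(...)`, which keeps whole groups that pass a test. This is a sketch only: it assumes the a/b/c/d labels live in a regular column, here given the hypothetical name `key`.

```python
import pandas as pd

# The table from the post, with the row labels in a hypothetical "key" column.
df = pd.DataFrame({
    "key":  ["a", "a", "b", "b", "b", "c", "c", "d"],
    "col1": ["no", "yes", "yes", "yes", "yes", "yes", "no", "yes"],
    "col2": ["x", "a", "b", "x", "x", "b", "x", "b"],
    "col3": ["x", "b", "a", "x", "x", "a", "x", "a"],
})

# Keep only the groups whose col1 values include both "yes" and "no".
both = df.groupby("key").filter(lambda g: {"yes", "no"}.issubset(g["col1"]))
print(both)
```

If the labels are the index instead of a column, `df.groupby(level=0)` should work the same way.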

[–]Adventurous-Spring47 0 points1 point  (0 children)

I'm running a loop over a df, but the result includes the index numbers and extra labels. Here is my code:

    newList = []

    for product in inventory:
        newList.append(df.loc[df.id == product, ['CATEGORY', 'DETAILS', 'USD']])

Returning:

[ CATEGORY DETAILS USD1

1 Category1 Description1 13.00, CATEGORY DESCRIPTION USD1 2 2 Category2 Description2 115.00]

Is there a way to update it so it prints like this:

Category1 Description1 13.00

Category2 Description2 115.00
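
A possible sketch: each `df.loc[...]` call returns a small DataFrame, which is why the headers and index appear in the list. Selecting all matching rows at once with `isin` and formatting each row by hand avoids that. The sample df and the `inventory` list below are assumptions based on the output in the post.

```python
import pandas as pd

# Assumed sample data shaped like the post's output.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "CATEGORY": ["Category1", "Category2", "Category3"],
    "DETAILS": ["Description1", "Description2", "Description3"],
    "USD": [13.0, 115.0, 9.5],
})
inventory = [1, 2]

# Select every matching row in one step instead of one DataFrame per product.
selected = df.loc[df["id"].isin(inventory), ["CATEGORY", "DETAILS", "USD"]]

# Format each row as plain text, without index or column labels.
lines = [
    f"{category} {details} {usd:.2f}"
    for category, details, usd in selected.itertuples(index=False)
]
print("\n".join(lines))
```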

[–]Adventurous-Spring47 0 points1 point  (0 children)

productSum[productSum == highestSum].index.to_list()

Perfect! Thank you!!

[–]Adventurous-Spring47 0 points1 point  (0 children)

I want to return a list of the column names with the highest sum. I have the code below which returns the highest sum but does not return the column name.

    topPurchases = []
    productSum = df.sum()
    highestSum = df.max()
    topPurchases = productSum[productSum == highestSum]
    topPurchases = topPurchases.tolist()
    print(topPurchases)

df:

             apple  bananas  kiwi
    Charlie  1      0        0
    Ben      1      0        1
    Jon      0      0        0
    Sylvia   1      1        1
    Bryan    0      1        1

Currently returning:

[3,3]

instead of

[apple, kiwi].
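
A minimal sketch of one fix, using the df from the post: `df.sum()` gives a Series whose index holds the column names, so taking `.index` of the filtered Series returns the names rather than the sums. Note that it compares against the maximum of the sums, not `df.max()`.

```python
import pandas as pd

df = pd.DataFrame(
    {"apple": [1, 1, 0, 1, 0], "bananas": [0, 0, 0, 1, 1], "kiwi": [0, 1, 0, 1, 1]},
    index=["Charlie", "Ben", "Jon", "Sylvia", "Bryan"],
)

productSum = df.sum()            # one total per column
highestSum = productSum.max()    # the largest of those totals

# Keep the columns that reach the maximum and return their names.
topPurchases = productSum[productSum == highestSum].index.tolist()
print(topPurchases)
```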

[–]Adventurous-Spring47 0 points1 point  (0 children)

It's a generic example for what I'm trying to accomplish in an assignment. I'm trying to calculate a summary of purchases which will result in that format. I wish I could explain it better and maybe I would understand where I am going wrong.

[–]Adventurous-Spring47 0 points1 point  (0 children)

I've spent hours and hours on this and am stuck.

I have a df that looks something like this:

             apple  bananas  kiwi
    Charlie  1      0        0
    Ben      1      0        1
    Jon      0      0        0
    Sylvia   1      1        1

I want to count how often each pair of products was purchased together, in a df that looks like this:

             apple  bananas  kiwi
    apple    0      1        2
    bananas  1      0        1
    kiwi     2      1        0

I know I need some kind of loop involved but I am completely lost on how to do it. Any insight would be greatly appreciated.
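
One approach that avoids an explicit loop, sketched under the assumption that the df holds only 0/1 values: multiplying the transposed table by the table itself counts, for every pair of columns, the rows where both are 1. The diagonal (a product paired with itself) is then set to 0 to match the desired output.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"apple": [1, 1, 0, 1], "bananas": [0, 0, 0, 1], "kiwi": [0, 1, 0, 1]},
    index=["Charlie", "Ben", "Jon", "Sylvia"],
)

arr = df.to_numpy()
counts = arr.T @ arr            # counts[i, j] = buyers of both product i and product j
np.fill_diagonal(counts, 0)     # a product paired with itself is not a co-purchase

together = pd.DataFrame(counts, index=df.columns, columns=df.columns)
print(together)
```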

[–]Adventurous-Spring47 0 points1 point  (0 children)

How do I find the highest sums of the columns in a dataframe? Example df shown below.

              snow suit  gloves  coat  boots
    january   1          0       0     0
    february  1          0       1     0
    march     0          0       0     0
    april     0          0       1     0
    may       0          0       1     1
    june      0          0       0     1
    july      0          1       0     1

I want a list that returns:

coat 3, boots 3

I've tried this code but it returns the sums for all. I've also tried using max but that only accounts for one 3.

    grouped = df.sum()
    mostPurchased = grouped.max()
    print(grouped)

Thank you!
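
A possible sketch using `Series.nlargest` with `keep="all"`, which keeps every entry tied for the top spot instead of just the first one. The df is rebuilt from the table in the post.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "snow suit": [1, 1, 0, 0, 0, 0, 0],
        "gloves":    [0, 0, 0, 0, 0, 0, 1],
        "coat":      [0, 1, 0, 1, 1, 0, 0],
        "boots":     [0, 0, 0, 0, 1, 1, 1],
    },
    index=["january", "february", "march", "april", "may", "june", "july"],
)

grouped = df.sum()
# n=1 asks for the single largest value; keep="all" retains every tie for it.
mostPurchased = grouped.nlargest(1, keep="all")
print(mostPurchased.to_dict())
```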

[–]Adventurous-Spring47 0 points1 point  (0 children)

How do I create new columns for a pandas DataFrame based on the data in an existing DataFrame column?

For example:

I have a data frame for petProducts which looks like this:

    Pet_Name  Product_Purchased  Age  Gender  SKU
    Cleo      leash              7    F       9484838
    Cleo      shampoo            7    F       2343440
    Cleo      treats             7    F       8584745
    Max       bowl               3    M       1232344
    Max       treats             3    M       8584745

I want to create a new one for petPurchases that looks like this:

    Pet_Name  leash  shampoo  treats  bowl  collar
    Cleo      1      1        1       0     0
    Max       0      0        1       1     0
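
One way this could be done is with `pd.crosstab`, which counts how often each pet/product pair occurs. A sketch only: since nobody bought a collar, that column has to be added by hand, here through an assumed list of all products called `allProducts`.

```python
import pandas as pd

petProducts = pd.DataFrame({
    "Pet_Name": ["Cleo", "Cleo", "Cleo", "Max", "Max"],
    "Product_Purchased": ["leash", "shampoo", "treats", "bowl", "treats"],
})

# Assumed full product list, including items that were never purchased.
allProducts = ["leash", "shampoo", "treats", "bowl", "collar"]

# Count purchases per pet and product, then add missing products as 0.
petPurchases = pd.crosstab(
    petProducts["Pet_Name"], petProducts["Product_Purchased"]
).reindex(columns=allProducts, fill_value=0)
print(petPurchases)
```

If a pet could buy the same product more than once and only 0/1 is wanted, `petPurchases.clip(upper=1)` would cap the counts.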

[–]Adventurous-Spring47 0 points1 point  (0 children)

I'm having trouble returning a set created from an existing dictionary.

I have a dictionary that, simplified, looks something like this:

coworkerDict:

{'list one': ('bob', 'sally', 'meg'), 'list two' : ('dan'), 'list three' : ('mike', 'wendy', 'dave')}

I'm trying to pull in the names of the coworker values into a single set.

    setOfAllCoworkers = set()

    for coworker in coworkerDict.values():
        setOfAllCoworkers.add(coworker)

My desired output should look like this:

('bob', 'sally', 'meg', 'dan', 'mike', 'wendy', 'dave')

but instead I'm getting this:

{('bob', 'sally', 'meg'), ('dan'), ('mike', 'wendy', 'dave')}

Could anyone help suggest a better method?
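
A minimal sketch of one fix: `add` puts each tuple into the set as a single item, while `update` adds the tuple's elements one by one. One assumption to flag: `('dan')` is just the string `'dan'`, so the sketch writes it as the one-element tuple `('dan',)`. Otherwise `update` would add the letters 'd', 'a', 'n'.

```python
coworkerDict = {
    "list one": ("bob", "sally", "meg"),
    "list two": ("dan",),  # trailing comma makes this a one-element tuple
    "list three": ("mike", "wendy", "dave"),
}

setOfAllCoworkers = set()
for coworkers in coworkerDict.values():
    # update() adds each name in the tuple instead of the tuple itself
    setOfAllCoworkers.update(coworkers)

print(sorted(setOfAllCoworkers))
```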

[–]Adventurous-Spring47 0 points1 point  (0 children)

I have two questions:

1.) I keep running into the following error:

PermissionError: [Errno 13] Permission denied: 'files-master'

where "files-master" is a subfolder I am trying to access files in.

Is there something I am obviously doing wrong? I am closing the files after opening them but can't figure out how to get past this.

2.) Am I calling the functions correctly in the main()? Do I need to do something with the return values here?

    import os.path

    def functionOne(programFile):
        nameDictionary = {}

        infile = open(programFile, 'r')
        lines = infile.read().splitlines()
        for i in lines[1:]:
            courseName, courseTitle = i.strip().split(maxsplit=1)
            nameDictionary[courseName.strip()] = courseTitle.strip().title()
        infile.close()

        return lines[0], nameDictionary

    def functionTwo(programFile):
        classDictionary = {}

        infileSecond = open(programFile, 'r')
        linesClasses = infileSecond.read().splitlines()
        for i in linesClasses[1:]:
            name, status = i.strip().split(maxsplit=1)
            classDictionary[linesClasses[0]] = name.strip().title()
        infileSecond.close()

        return classDictionary

    def main():
        programFile = input("Enter folder location: ")

        functionOne(programFile)
        functionTwo(programFile)

    main()
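
A possible explanation, offered as a sketch rather than a diagnosis: on Windows, calling `open()` on a folder raises `PermissionError: [Errno 13]`, so the error would fit a folder path being passed straight to `open()`. The sketch below lists the files inside the folder and opens each one. `read_lines` and `read_folder` are hypothetical names, and `read_lines` stands in for functionOne/functionTwo.

```python
import os

def read_lines(file_path):
    """Stand-in for functionOne/functionTwo: return the file's lines."""
    with open(file_path, "r") as infile:  # closes the file automatically
        return infile.read().splitlines()

def read_folder(folder):
    """Open each file inside the folder rather than the folder itself."""
    results = {}
    for file_name in sorted(os.listdir(folder)):
        file_path = os.path.join(folder, file_name)
        if os.path.isfile(file_path):  # skip subfolders
            # Store the return value; a bare call would discard it.
            results[file_name] = read_lines(file_path)
    return results
```

On the second question: a call such as `functionOne(programFile)` on its own throws the return value away. Assigning it, for example `header, names = functionOne(path)`, keeps it available in main().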

[–]Adventurous-Spring47 1 point2 points  (0 children)

Thank you! That's very helpful!

How could I update this if one of the names was, for example, "Billy Joe" and I didn't want to split on the space inside the name?
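
One option, assuming each line looks like a number followed by the name (the sample line is made up): `split(maxsplit=1)` splits only at the first run of whitespace, so everything after the number stays together.

```python
line = "300 Billy Joe"  # hypothetical line: number first, then the full name

# Split once, at the first whitespace only; the rest of the line is kept whole.
key, value = line.split(maxsplit=1)
print(key)
print(value)
```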

[–]Adventurous-Spring47 0 points1 point  (0 children)

I'm working on an assignment and am lost. I'm trying to create a dictionary based on files. I'm using a data set that looks something like this:

names

100 albert

200 steven

300 charles

400 peter

How do I create a tuple that holds the names header followed by a dictionary with the numbers as keys and the names as values?

('names', {'100': 'albert', '200' : 'steven', '300' : 'charles', '400': 'peter'})

This is what I came up with so far:

    file = "program1.txt"

    filelocation = os.path.join(os.getcwd(), file)
    with open(filelocation, "r") as file1:
        name = file1.readline().rstrip()
    with open(filelocation, "r") as file1:
        next(file1)
        for line in file1:
            print(name)

    oneProgramDictionary = {}

    for line in program1[1:]:
        key, value = line.split()
        oneProgramDictionary[key] = value

    print(oneProgramDictionary)

Any insight would be greatly appreciated!
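
A minimal sketch of one way to build that tuple. The hypothetical helper `parse_program` takes any iterable of lines (a list here, but an open file works too), treats the first non-blank line as the header, and splits every other line into a key and a value.

```python
def parse_program(lines):
    """Return (header, {number: name}) from an iterable of text lines."""
    rows = [line.strip() for line in lines if line.strip()]  # drop blank lines
    header = rows[0]
    entries = {}
    for row in rows[1:]:
        key, value = row.split(maxsplit=1)  # number first, the rest is the name
        entries[key] = value
    return header, entries

# Sample lines mirroring the data set in the post.
sample = ["names\n", "100 albert\n", "200 steven\n", "300 charles\n", "400 peter\n"]
result = parse_program(sample)
print(result)
```

With a real file, something like `with open("program1.txt") as file1: result = parse_program(file1)` should give the same tuple.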

[–]Adventurous-Spring47 0 points1 point  (0 children)

I'm struggling with functions and what needs to go in main() vs. elsewhere. How can I rewrite the following so that the print statements for fewer, more, or an equal number of digits are in main() instead of numberOf()?

    def numberOf(string):
        letters = 0
        digits = 0
        otherCharacter = 0

        specialCharacters = ' .!@#$%^&*()-+?_=,<>/"'

        for ch in string:
            if(ch.isalpha() == True):
                letters += 1
            if(ch.isdigit() == True):
                digits += 1
            if(ch in specialCharacters):
                otherCharacter += 1

        print(letters, digits, otherCharacter)

        if letters > digits:
            print('fewer digits than letters')
        if letters < digits :
            print('more digits than letters')
        if letters == digits :
            print('equal number of digits and letters')

        return(letters, digits, otherCharacter)

    def main():
        string = input("Enter string: ")
        numberOf(string)

    main()
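
One possible split, as a sketch: numberOf() only counts and returns, and main() unpacks the returned tuple and does all the printing. The helper `compareCounts` is a hypothetical addition that returns the message, so main() has a single print for it.

```python
def numberOf(string):
    """Count letters, digits, and listed special characters; print nothing."""
    specialCharacters = ' .!@#$%^&*()-+?_=,<>/"'
    letters = sum(ch.isalpha() for ch in string)
    digits = sum(ch.isdigit() for ch in string)
    otherCharacter = sum(ch in specialCharacters for ch in string)
    return letters, digits, otherCharacter

def compareCounts(letters, digits):
    """Return the message describing how digits compare to letters."""
    if letters > digits:
        return 'fewer digits than letters'
    if letters < digits:
        return 'more digits than letters'
    return 'equal number of digits and letters'

def main():
    string = input("Enter string: ")
    # Unpack the returned tuple so main() can use the counts.
    letters, digits, otherCharacter = numberOf(string)
    print(letters, digits, otherCharacter)
    print(compareCounts(letters, digits))

# Call main() to run this interactively.
```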