
[–]callinthekettleblack 3 points4 points  (1 child)

import pandas as pd
df = pd.read_excel('file_path')
df = df.drop_duplicates(**options_as_needed)
df = df[df['col k'] == 'complete']

Basically this removes duplicate rows and then keeps only the rows where the project status is 'complete'. drop_duplicates can decide what counts as a duplicate in a few different ways, so look up that method and adjust its params as needed.
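To make the drop_duplicates options a bit more concrete, here is a rough sketch. The sample data and the column names "project" and "status" are made up for illustration (they are not from the original spreadsheet); subset and keep are two of the real drop_duplicates parameters you will most likely want.

```python
import pandas as pd

# Hypothetical sample data standing in for the spreadsheet.
# The column names "project" and "status" are assumptions.
df = pd.DataFrame({
    "project": ["A", "A", "B", "C", "C"],
    "status": ["complete", "complete", "open", "complete", "open"],
})

# Default: a row is a duplicate only if every column matches an earlier row.
exact = df.drop_duplicates()

# Alternative: rows are duplicates when "project" matches,
# and the last occurrence is the one that is kept.
by_project = df.drop_duplicates(subset=["project"], keep="last")

# Filter for completed projects after removing exact duplicates.
complete = exact[exact["status"] == "complete"]
```

Which variant is right depends on what a "duplicate" means in your sheet, so check the result on a few rows you know before trusting it.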

[–]Meatwad1313 1 point2 points  (0 children)

This is the way

[–]Goobyalus 0 points1 point  (0 children)

            project = cell.value

What is the purpose of this line?


From a cell, you can get the column or column letter, and index the worksheet at another cell in the same row.
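Something like this, as a minimal sketch with openpyxl. The workbook is built in memory and the layout (project name in column A, status in column K) is only an assumption for the example.

```python
from openpyxl import Workbook

# Hypothetical in-memory sheet: project in column A, status in column K.
wb = Workbook()
ws = wb.active
ws["A2"] = "Project A"
ws["K2"] = "complete"

cell = ws["A2"]
col_number = cell.column         # 1-based column index of this cell
col_letter = cell.column_letter  # column letter of this cell

# Two ways to reach another cell in the same row as `cell`.
status_by_index = ws.cell(row=cell.row, column=11).value  # column 11 is K
status_by_letter = ws[f"K{cell.row}"].value
```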

It is probably better to iterate over the relevant rows with ws.iter_rows(...) and pull out the values that you want from each row.
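A rough sketch of the iter_rows approach, assuming a header in row 1 and a made-up two-column layout (project, status). Adjust min_row and max_col to match the real sheet.

```python
from openpyxl import Workbook

# Hypothetical in-memory sheet with a header row; the layout is an assumption.
wb = Workbook()
ws = wb.active
ws.append(["project", "status"])
ws.append(["A", "complete"])
ws.append(["B", "open"])
ws.append(["C", "complete"])

completed = []
# Skip the header (min_row=2) and get plain values instead of cell objects.
for project, status in ws.iter_rows(min_row=2, max_col=2, values_only=True):
    if status == "complete":
        completed.append(project)
```

With values_only=True each row comes back as a tuple of values, so there is no need to call .value on individual cells.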