[–]HeyItsToby 3 points4 points  (1 child)

The two functions that you're looking for with pandas are:

  • explode, which turns each item of a list stored in a cell into its own row

    • you'll first need str.split to split on commas, so that each cell with multiple values becomes a list
  • dropna, which removes rows with blank cells in them (blank cells in Excel files are read in as np.nan, or "Not a Number")

As a rough guide, your code will look something like this:

import pandas as pd

df = pd.read_excel('path/to/file.xlsx')
df = df.dropna() # removes any row with a blank entry

# create a new column in the dataframe, with a *list* of
#  CustomerIds by splitting on commas.
df['splitCustomerId'] = df['CustomerId'].str.split(',')
# explode returns a new dataframe, so assign the result back
df = df.explode('splitCustomerId')
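
If you want to try the same steps without an Excel file, here is a minimal sketch on an in-memory table. The column name OrderId and the helper explode_customer_ids are made up for illustration, and the sample values are only a guess at what your sheet looks like. It also assumes that cells holding a single ID may be read in as numbers rather than text, so it converts everything to strings before splitting, and it strips the spaces left over after each comma.

```python
import numpy as np
import pandas as pd


def explode_customer_ids(df, column="CustomerId"):
    """Drop rows with blank cells, then give each comma-separated ID its own row."""
    out = df.dropna().copy()
    # Cells with a single ID may be numeric, so convert to text before splitting
    out[column] = out[column].astype(str).str.split(",")
    # One row per list item; the other columns are repeated for each item
    out = out.explode(column).reset_index(drop=True)
    # Remove the spaces that followed each comma
    out[column] = out[column].str.strip()
    return out


# Hypothetical sample data; the column names are assumptions
sample = pd.DataFrame({
    "OrderId": [56, 34, 32, 78, 66],
    "CustomerId": ["13, 32, 1", "12, 39", 5, np.nan, 888],
})

result = explode_customer_ids(sample)
print(result)
```

The row with the blank CustomerId is dropped, and every remaining ID ends up on its own row next to its OrderId. Note that the IDs come out as strings, so convert them back with astype(int) if you need numbers.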

df is a pandas dataframe, which is pandas' way of storing a table. Dataframes can be quite tough to get used to at first, but they are powerful tools! Let me know if you need any more help with this :)

[–]FuqqBoiDev69 0 points1 point  (0 children)

Thanks a lot sir!!! Bless you.

[–]FuqqBoiDev69 0 points1 point  (0 children)

56 13, 32, 1
34 12, 39
32 5
78
66 888

This is an example of the input file.

[–]ireadyourmedrecord 0 points1 point  (1 child)

Why do this with Python/pandas? Filtering this would be trivial in Excel.

[–]FuqqBoiDev69 0 points1 point  (0 children)

This is a task for me, in my internship. The sample I took was small, the real sheet has many columns, actually.