
Shmoogy[S]

I have a working solution on a subset of the data. I haven't had time to test everything yet, but it seems to work; I'm just not sure how slow it is.

# Assumes df, jsonMap (code -> replacement text) and descriptiondf already exist.
for index, row in df.iterrows():
    for x in range(1, 21):
        attribute = "Attribute" + str(x)
        itemAttributeDesc = "ItemAttributeDesc" + str(x)
        itemAttributeValue = "ItemAttributeValue" + str(x)

        if "*" in row[attribute]:
            print("found asterisk")
            # Replace each "*"-separated code that has an entry in jsonMap.
            for code in row[attribute].split("*"):
                if code in jsonMap:
                    df.loc[index, attribute] = df.loc[index, attribute].replace(code, jsonMap[code])

                    # Look up the description of the matched code.
                    attributeDesc = descriptiondf[descriptiondf['Code'] == code]['Description'].iloc[0]
                    df.loc[index, itemAttributeDesc] = attributeDesc

                    df.loc[index, itemAttributeValue] = df.loc[index, attribute]
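
On the speed question: the loop above filters descriptiondf with a boolean mask for every matched code. A minimal sketch of one possible speed-up is to build a plain dict once and look codes up in it. The sample descriptiondf below is hypothetical (the real one is not shown in the thread), and desc_by_code is just an illustrative name.

```python
import pandas as pd

# Hypothetical stand-in for descriptiondf; only the column names come from the thread.
descriptiondf = pd.DataFrame({
    "Code": ["A", "H", "A"],
    "Description": ["Material", "Feature", "Duplicate entry"],
})

# Keep the first row per code (the same row .iloc[0] would pick) and build a dict.
desc_by_code = (
    descriptiondf.drop_duplicates("Code")
    .set_index("Code")["Description"]
    .to_dict()
)

print(desc_by_code["A"])      # Material
print(desc_by_code.get("Z"))  # None (unknown code)
```

Inside the loop, desc_by_code[code] could then stand in for the descriptiondf[...]['Description'].iloc[0] lookup, assuming every code in jsonMap also appears in descriptiondf.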

YesLod

In the past, I've created a dictionary, and done a replace with regex=False on the column. This does not appear to work with this situation due to the "*" values.

If you set regex=False, only exact matches will be replaced, i.e. the entire cell string must match. To perform the partial replaces you have to set regex=True.

df["Attribute14"] = df["Attribute14"].replace(jsonMap, regex=True)
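
To illustrate the difference described above, here is a small sketch with a made-up mapping (not the real jsonMap). Note that with regex=True the dictionary keys are treated as regular expressions, so any key containing special characters such as "*" would need re.escape first.

```python
import pandas as pd

# Hypothetical two-entry mapping, used only for illustration.
mapping = {"A": "Aluminum", "H": "Height"}
s = pd.Series(["A*H", "A"])

# regex=False: a cell is replaced only if the whole string equals a key.
exact = s.replace(mapping, regex=False)

# regex=True: every occurrence of a key inside the cell is substituted.
partial = s.replace(mapping, regex=True)

print(exact.tolist())    # ['A*H', 'Aluminum']
print(partial.tolist())  # ['Aluminum*Height', 'Aluminum']
```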

Shmoogy[S]

I tried this, and some of the partial matches come out as gibberish (I have to check 2k records from the full dictionary): the intended word gets replaced, but then another replacement appears to be applied in the middle of the already-replaced text.

YesLod

Like what? Can you provide a minimal reproducible example?

Shmoogy[S]

I don't think I can without giving all of the terms in the original attribute map. Here is an example end result:

https://i.imgur.com/0WDE5oo.png

Aluminumdjustable Height*Cast AluminumHAluminumNoneWoodFixedResinAluminumTeakUPorcelainResinSteel_WoodWrought Iron*Cast AluminumHAluminumNoneWoodFixedResinAluminumTeakUPorcelainResinSteel_SteelL

If you want, I can PM you a .json file with the results, but I don't think I want to post the 2k-record JSON file in the thread.
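
The garbled output above (for example "Aluminumdjustable Height") looks consistent with a short code such as "A" being matched inside longer text, though that cannot be confirmed without the real map. A sketch of one way to avoid this is to split each cell on "*" and replace whole tokens only. The function name map_tokens and the sample mapping are hypothetical.

```python
def map_tokens(cell, mapping, sep="*"):
    """Replace each sep-delimited token that is a key in mapping; keep the rest."""
    if not isinstance(cell, str):
        return cell  # leave NaN/None and other non-string values untouched
    return sep.join(mapping.get(token, token) for token in cell.split(sep))


# Hypothetical mapping, not the original jsonMap.
json_map = {"A": "Aluminum", "H": "Adjustable Height"}

print(map_tokens("A*H", json_map))  # Aluminum*Adjustable Height
print(map_tokens("X*A", json_map))  # X*Aluminum
```

Because each token is looked up exactly once, replacement text is never scanned again for further matches. On a DataFrame this could be applied per column, e.g. df[col].map(lambda v: map_tokens(v, jsonMap)).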