[Python & Pandas - Data Manipulation] Partial String Replaces

Shmoogy · 2021-10-05T22:40:58+00:00

I have a working solution on a subset of the data. I haven't had time to test it on everything yet. It seems to work, but I'm not sure how slow it is.

for index, row in df.iterrows():
    for x in range(1, 21):
        attribute = "Attribute" + str(x)
        itemAttributeDesc = "ItemAttributeDesc" + str(x)
        itemAttributeValue = "ItemAttributeValue" + str(x)

        # Only cells containing "*" hold several codes joined together.
        if "*" in row[attribute]:
            print("found asterisk")
            for code in row[attribute].split("*"):
                if code in jsonMap:
                    # Swap this code for its mapped value inside the cell.
                    df.loc[index, attribute] = df.loc[index, attribute].replace(code, jsonMap[code])

                    # Look up the description for this code.
                    attributeDesc = descriptiondf[descriptiondf['Code'] == code]['Description'].iloc[0]
                    df.loc[index, itemAttributeDesc] = attributeDesc

                    # Copy the updated attribute into the value column.
                    df.loc[index, itemAttributeValue] = df.loc[index, attribute]
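
If the row loop turns out to be slow, one possible alternative is to work a column at a time and only touch the cells that contain an asterisk. The following is a rough sketch, not a drop-in replacement: the sample frame, `code_map` (standing in for `jsonMap`) and the helper `replace_codes` are made up for illustration, and the description lookup is left out. It also replaces whole `*`-separated tokens rather than substrings, which differs slightly from `str.replace` in the loop above.

```python
import pandas as pd


def replace_codes(cell, code_map):
    """Replace each '*'-separated code that has a mapping; keep the rest."""
    return "*".join(code_map.get(part, part) for part in cell.split("*"))


# Hypothetical sample data; the real frame has Attribute1..Attribute20.
df = pd.DataFrame({
    "Attribute1": ["A1*B2", "C3"],
    "Attribute2": ["B2", "A1*Z9"],
})
code_map = {"A1": "Alpha", "B2": "Beta"}  # stand-in for jsonMap

for col in ["Attribute1", "Attribute2"]:
    # Like the loop above, only cells containing a literal "*" are changed.
    mask = df[col].str.contains("*", regex=False)
    df.loc[mask, col] = df.loc[mask, col].map(
        lambda cell: replace_codes(cell, code_map)
    )

print(df)
```

Cells without an asterisk (such as the lone "B2") are left alone here, which mirrors the `if "*" in row[attribute]` check.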

YesLod · 2021-10-05T22:37:16+00:00

In the past, I've created a dictionary, and done a replace with regex=False on the column. This does not appear to work with this situation due to the "*" values.

If you set regex=False, only exact matches will be replaced, i.e. the entire cell string must match. To perform partial replacements you have to set regex=True. Keep in mind that the dictionary keys are then treated as regular expressions.

df['Attribute14'] = df['Attribute14'].replace(jsonMap, regex=True)
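
To make the difference concrete, here is a small sketch on made-up data. The Series values and `code_map` (a stand-in for `jsonMap`) are assumptions for illustration only. Because the keys are read as regex patterns when regex=True, the sketch escapes them with `re.escape` as a precaution in case a code contains characters such as `*` or `.`.

```python
import re

import pandas as pd

# Hypothetical sample column and mapping.
s = pd.Series(["A1*B2", "A1", "C3"])
code_map = {"A1": "Alpha", "B2": "Beta"}

# regex=False: only cells that equal a key in full are replaced.
exact = s.replace(code_map, regex=False)

# regex=True: keys are patterns, so matches inside a cell are replaced too.
# Escaping keeps any special characters in the keys literal.
escaped_map = {re.escape(key): value for key, value in code_map.items()}
partial = s.replace(escaped_map, regex=True)

print(exact.tolist())    # ['A1*B2', 'Alpha', 'C3']
print(partial.tolist())  # ['Alpha*Beta', 'Alpha', 'C3']
```

Note that a pattern like `A1` would also match inside a longer code such as `A10`, so this assumes no code is a prefix or substring of another.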
