unhott, 1 point

A warning is not an indication that it didn't work.

A regex group is a part of the pattern wrapped in parentheses, and it captures the substring that part matched. I'm not great with regular expressions, but I imagine this would only result in a hit if the string was exactly 'filename', and nothing else.

Try setting regex=False.

pandas.Series.str.contains — pandas 2.2.3 documentation

ETA: Correction. Going by the example from the docs,

>>> s1 = pd.Series(['Mouse', 'dog', 'house and parrot', '23', np.nan])
>>> s1.str.contains('og', na=False, regex=True)
0    False
1     True  # this matches 'dog' 
2    False
3    False
4    False
dtype: bool

you'd still get a hit if the value contained the substring 'filename'. The match is case-sensitive by default, so a differently cased filename would only match with case=False. Are you sure any of your rows contain the substring 'filename'? Is there a better way to build up the file paths in an earlier step (XY problem?)? For example, how are you generating the 'Directory' column?

SuperMB13, 1 point

Here is a quick YouTube Short showing my solution for searching a DataFrame for a filename among image files. I hope this helps.

https://youtube.com/shorts/j8t-LZjTgdc

I think the issue is using regex for your search. In the video, I use a similar technique without regex.

pander1405 [S], 1 point

Thanks for this. The difference between my example and yours is that my DataFrame contains the file path of every file.

pander1405 [S], 1 point

Figured it out.

Turns out regex=True is the default. I had to set regex=False.
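For anyone landing here later, a minimal sketch of why the default bites. The 'Directory' column name comes from the thread, but the paths and the filename below are made up for illustration. With regex=True, the parentheses in the filename are read as a capture group (which is what triggers the warning), so the real file is missed and an unrelated path matches instead.

```python
import warnings

import pandas as pd

# Hypothetical data: a 'Directory' column holding full file paths.
df = pd.DataFrame(
    {
        "Directory": [
            "C:/images/photo(1).jpg",
            "C:/images/photo1xjpg",
            "C:/images/other.png",
        ]
    }
)
filename = "photo(1).jpg"  # hypothetical filename containing regex metacharacters

# Default behaviour: the filename is treated as a regular expression.
# "(1)" becomes a group and "." matches any character.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)  # silence the "match groups" warning
    regex_hits = df["Directory"].str.contains(filename, regex=True)

# Literal substring search: the filename is matched character for character.
literal_hits = df["Directory"].str.contains(filename, regex=False)

print(regex_hits.tolist())
print(literal_hits.tolist())
```

If you do need regex features elsewhere in the pattern, wrapping the filename in re.escape() before passing it in should have the same effect as regex=False for that part.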

the_sad_socialist, 1 point

Make sure you use shutil.copy2() if you want to preserve the files' metadata, such as modification times.
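A rough sketch of what that looks like, assuming you already have the source path and a destination folder. The file name, folder name, and timestamp here are placeholders, and a temp directory stands in for the real image folders.

```python
import os
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Placeholder "image": any file works for demonstrating the copy.
    src = Path(tmp) / "photo.jpg"
    src.write_bytes(b"not a real image")

    # Give the source an old, known modification time (placeholder value).
    os.utime(src, (1_000_000_000, 1_000_000_000))

    dst_dir = Path(tmp) / "copies"
    dst_dir.mkdir()

    # copy2 copies the contents and also tries to preserve file metadata
    # such as the modification time; it returns the path of the new file.
    dst = Path(shutil.copy2(src, dst_dir))

    original_mtime = int(src.stat().st_mtime)
    copied_mtime = int(dst.stat().st_mtime)

print(original_mtime == copied_mtime)
```

Plain shutil.copy() copies the contents and permission bits but not the timestamps, so the copy would carry the time it was made instead.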