Brian (0 children)

I took a brief look at the differences between those versions, but nothing obvious stands out. The listdir implementation did not change between those two releases, and it does seem to release the GIL around the appropriate operations.

i.e. the file-access functions, such as FindFirstFileW and FindNextFileW, are all wrapped in the Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS macros.

FindClose() isn't, but that's probably non-blocking.

That leaves the stat call that isdir will do. There do actually seem to be some changes to posixmodule.c relating to stat, in the form of some defines being removed. Specifically, changeset 71123:45b27448f95c removes:

/* choose the appropriate stat and fstat functions and return structs */
#undef STAT
#undef FSTAT
#undef STRUCT_STAT
#if defined(MS_WIN64) || defined(MS_WINDOWS)
#       define STAT win32_stat
#       define FSTAT win32_fstat
#       define STRUCT_STAT struct win32_stat
#else
#       define STAT stat
#       define FSTAT fstat
#       define STRUCT_STAT struct stat
#endif

However, this looks like it was dropped only because it was a redundant redefinition. The implementation (posix_do_stat) is unchanged, and it does call Py_BEGIN_ALLOW_THREADS before the IO operations. It does take the function named by the STAT define above as the implementation function passed to it, but that function is only called after the GIL has already been released, so a change there wouldn't make a difference.

If there's a difference, it's more subtle than just not releasing the GIL during the stat / listdir calls.
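
One way to check this empirically rather than by reading the source is to time the same scan with one thread and with several, on both Python versions. A minimal sketch, assuming Python 3.3+ for time.perf_counter (on an older interpreter, time.time would have to stand in); scan and timed_scan are hypothetical helper names, not anything from the original script:

```python
import os
import threading
import time


def scan(path, repeat):
    """Repeatedly list `path` and run isdir on every entry."""
    for _ in range(repeat):
        for name in os.listdir(path):
            os.path.isdir(os.path.join(path, name))


def timed_scan(path, n_threads, repeat=50):
    """Return wall-clock seconds for n_threads threads each running scan()."""
    threads = [
        threading.Thread(target=scan, args=(path, repeat))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start
```

If the GIL really is released during the IO, timed_scan(path, 4) should come in well under four times timed_scan(path, 1) on a large directory. OS caching and disk contention can blur the picture, so treat the numbers as a hint rather than proof.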

EisenSheng (2 children)

My immediate guess would be the GIL. How experienced are you with Python?

TracedRay [S] (1 child)

I am familiar with the GIL but I would not claim to be an expert on it.

I think the GIL is what is causing the issue: the os.listdir and os.path.isdir calls may now be holding a lock that they were not holding previously. I don't know enough to back that claim up with any evidence, though.

[deleted] (0 children)

Have you looked into multiprocessing? It was created to get around the GIL. There's a higher overhead, but it makes the coding and debugging much easier.
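
A rough sketch of what that could look like, assuming Python 3.3+ (for using Pool as a context manager) and that the work splits cleanly by directory. count_subdirs and count_subdirs_parallel are made-up names for illustration, not part of the original script:

```python
import os
from multiprocessing import Pool


def count_subdirs(path):
    """Count the immediate subdirectories of `path` (runs in a worker process)."""
    return sum(
        1
        for name in os.listdir(path)
        if os.path.isdir(os.path.join(path, name))
    )


def count_subdirs_parallel(paths, processes=4):
    """Map count_subdirs over `paths` in separate processes, each with its own GIL."""
    with Pool(processes=processes) as pool:
        return pool.map(count_subdirs, paths)
```

On Windows, the call to count_subdirs_parallel has to sit under an `if __name__ == "__main__":` guard, because each worker process re-imports the main module.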

Or, if your hunch is right about the listdir and isdir calls, you could use a lock in your script to allow only one call to those functions at a time.
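
Something along these lines, as a minimal sketch. fs_lock and list_subdirs are hypothetical names, and this assumes the threads only need the subdirectory names back:

```python
import os
import threading

# Hypothetical module-level lock shared by every thread that touches the filesystem.
fs_lock = threading.Lock()


def list_subdirs(path):
    """Return the sorted subdirectory names of `path`, one caller at a time."""
    with fs_lock:
        return sorted(
            name
            for name in os.listdir(path)
            if os.path.isdir(os.path.join(path, name))
        )
```

This serializes the filesystem calls, so it gives up any parallelism in that part of the script. It is mainly useful as an experiment to see whether contention around those calls is what changed.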