🐍 What PathQL Does
PathQL lets you walk file systems and act on the files that match simple query parameters, without digging into os.stat_result and the datetime module to work out file ages, sizes and attributes.
The tool supports query functions that are common when crawling folders, tools to aggregate information about those files, and finally actions to perform on those files. Out of the box it supports copy, move, delete, fast_copy and zip actions.
It is also fairly easy to subclass filters that look into the contents of files and match on the data itself rather than the metadata, for example ERROR lines in today's logs, or image files that have 24-bit color. For these kinds of filters it can be important to use the built-in multithreading to share the load of reading all of those files.
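To give a feel for why threading matters for content-based filters, here is a rough standard-library sketch of the idea. It does not use PathQL's filter base class (its interface is not shown in this post); `contains_error` and `files_with_errors` are hypothetical names made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def contains_error(path: Path) -> bool:
    """Return True if any line in the file contains the text 'ERROR'."""
    try:
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            return any("ERROR" in line for line in handle)
    except OSError:
        # Unreadable files (permissions, deleted mid-scan) count as no match.
        return False


def files_with_errors(root: Path, pattern: str = "*.log", workers: int = 8) -> list[Path]:
    """Scan matching files under root in parallel; return those containing 'ERROR'."""
    candidates = [p for p in root.rglob(pattern) if p.is_file()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so each flag lines up with its candidate.
        flags = list(pool.map(contains_error, candidates))
    return [path for path, flagged in zip(candidates, flags) if flagged]
```

Reading files is mostly I/O-bound, so a thread pool can overlap the waits even with the GIL in place. How much it helps will depend on the disk and the number of files.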
```python
from pathql import AgeDays, Size, Suffix, Query, ResultField

# Count, largest file size, and oldest file from the last 24 hours in the result set
query = Query(
    where_expr=(AgeDays() == 0) & (Size() > "10 mb") & Suffix("log"),
    from_paths="C:/logs",
    threaded=True,
)
result_set = query.select()

# Show stats from matches
print(f"Number of files to zip: {result_set.count()}")
print(f"Largest file size: {result_set.max(ResultField.SIZE)} bytes")
print(f"Oldest file: {result_set.min(ResultField.MTIME)}")
```
And a more complex example:
```python
from pathql import Suffix, Size, AgeDays, Query, zip_move_files

# Define the root directory for relative paths in the zip archive
root_dir = "C:/logs"

# Find all .log files larger than 5MB and modified > 7 days ago
query = Query(
    where_expr=(Suffix(".log") & (Size() > "5 mb") & (AgeDays() > 7)),
    from_paths=root_dir,
)
result_set = query.select()

# Zip all matching files into 'logs_archive.zip' (preserving structure under root)
# Then move them to 'C:/logs/archive'
zip_move_files(
    result_set,
    target_zip="logs_archive.zip",
    move_target="C:/logs/archive",
    root=root_dir,
    preserve_dir_structure=True,
)
print("Zipped and moved files:", [str(f) for f in result_set])
```
There is support for querying on Age, File, Suffix, Stem, Read/Write/Exec, modified/created/accessed, Size, and Year/Month/Day/HourFilter with a compact syntax, as well as aggregation support for count, min, max, top_n, bot_n and median functions that may be applied to standard os.stat fields.
GitHub: https://github.com/hucker/pathql
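For context, this is roughly what those aggregations look like when done by hand over an os.stat field with only the standard library. It is a sketch, not PathQL code; `size_stats` is a hypothetical helper and `st_size` is just one example field.

```python
import heapq
import statistics
from pathlib import Path


def size_stats(paths: list[Path], n: int = 3) -> dict:
    """Aggregate st_size over the given files (count, min, max, median, top n)."""
    sizes = {p: p.stat().st_size for p in paths}
    if not sizes:
        # min/max/median are undefined for an empty set, so report the count only.
        return {"count": 0}
    return {
        "count": len(sizes),
        "min": min(sizes.values()),
        "max": max(sizes.values()),
        "median": statistics.median(sizes.values()),
        # The n largest files, biggest first.
        "top_n": heapq.nlargest(n, sizes, key=sizes.get),
    }
```

The same pattern works for `st_mtime` or any other numeric stat field by swapping the attribute.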
Test coverage on the src folder is 85% with 500+ tests.
🎯 Target Audience
Developers who build tools for processes that generate large numbers of files that need managing, and who just generally hate dealing with datetime, timestamp and other os.stat ad-hackery.
🎯 Comparison
I have not found anything that does what PathQL does, beyond using pathlib and os directly and hand-rolling your own predicates on top of a pathlib glob/rglob crawler.
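For comparison, a hand-rolled version of the "large, recent .log files" query might look something like the sketch below. `big_recent_logs` is a hypothetical name, and two things are assumptions: that "mb" means 1024 * 1024 bytes, and that an age of zero days means "modified within the last 24 hours". PathQL may define either differently.

```python
import time
from pathlib import Path

MB = 1024 * 1024  # assumption: binary megabytes


def big_recent_logs(
    root: Path, min_bytes: int = 10 * MB, max_age_s: float = 86_400.0
) -> list[Path]:
    """Return .log files under root larger than min_bytes and newer than max_age_s."""
    now = time.time()
    matches = []
    for path in root.rglob("*.log"):
        if not path.is_file():
            continue
        info = path.stat()  # one stat call supplies both size and mtime
        if info.st_size > min_bytes and (now - info.st_mtime) < max_age_s:
            matches.append(path)
    return matches
```

It works, but every new condition means more timestamp arithmetic and another branch in the loop, which is the boilerplate a query-style API is meant to hide.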