Multiprocess directory iteration via os.scandir()
Who's this Lib for?
You want to process a large number of files and/or a few very big files and give feedback to the user on how long it will take.
The main process starts the statistics processes in the background via Python's multiprocessing module and then begins the actual work immediately.
Two background statistics processes collect the information for the progress bars: one counts all filesystem items, and the other accumulates the file sizes.
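The arrangement described above could be sketched roughly as follows. This is a minimal illustration, not IterFilesystem's actual code: the names `iter_entries`, `count_worker`, `size_worker`, and `process_tree` are hypothetical, and returning the totals through `multiprocessing.Value` is an assumption made for brevity.

```python
import multiprocessing as mp
import os


def iter_entries(top):
    """Yield every os.DirEntry below top (symlinked directories are not followed)."""
    stack = [top]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as it:
                for entry in it:
                    yield entry
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            continue  # unreadable directory: skip it in this sketch


def count_worker(top, result):
    """Background process 1: count all filesystem items (no stat() calls)."""
    result.value = sum(1 for _ in iter_entries(top))


def size_worker(top, result):
    """Background process 2: accumulate the sizes of all regular files."""
    total = 0
    for entry in iter_entries(top):
        if entry.is_file(follow_symlinks=False):
            total += entry.stat(follow_symlinks=False).st_size
    result.value = total


def process_tree(top):
    """Start both statistics processes, then begin the real work right away."""
    total_items = mp.Value("q", -1)  # -1 means "not known yet"
    total_bytes = mp.Value("q", -1)
    workers = [
        mp.Process(target=count_worker, args=(top, total_items)),
        mp.Process(target=size_worker, args=(top, total_bytes)),
    ]
    for worker in workers:
        worker.start()

    done = 0
    for _entry in iter_entries(top):
        done += 1  # placeholder for the actual per-item work

    for worker in workers:
        worker.join()
    return done, total_items.value, total_bytes.value
```

When trying this out, call `process_tree(path)` from inside an `if __name__ == "__main__":` guard, since `multiprocessing` may re-import the main module in the child processes on some platforms.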
Why two processes?
Because collecting only the count of all filesystem items via os.scandir() is very fast. This is the fastest way to predict the processing time.
Using os.DirEntry.stat() to get the file size is significantly slower: on Unix it always requires an additional system call per entry.
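To get a feel for the difference between the two passes, a small comparison on a single directory might look like this. The helper names `count_entries`, `sum_file_sizes`, and `timed` are made up for this sketch, and the measured gap will depend heavily on the platform and on filesystem caching.

```python
import os
import time


def count_entries(path):
    """Count the items in one directory using only what os.scandir() yields."""
    with os.scandir(path) as it:
        return sum(1 for _ in it)


def sum_file_sizes(path):
    """Sum the sizes of regular files; each stat() may cost an extra system call."""
    total = 0
    with os.scandir(path) as it:
        for entry in it:
            if entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
    return total


def timed(func, path):
    """Return (result, elapsed seconds) for a single call of func(path)."""
    start = time.perf_counter()
    result = func(path)
    return result, time.perf_counter() - start
```

For example, `timed(count_entries, "/usr/lib")` and `timed(sum_file_sizes, "/usr/lib")` can be compared on a large directory; the path is only an illustration.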
OK, but why two processes?
Using only the total count of all DirEntry objects may result in a poor time estimate in the progress indication. It depends on what the actual work is: when processing the contents of large files, it is good to know how much data has to be processed in total.
That's why we use two approaches: the DirEntry count to forecast the processing time very quickly, and the total size to refine that estimate.
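One way to combine the two measurements into a single estimate is to average the item-based and the size-based progress. The sketch below is an assumption about how such an average could be computed, not the formula IterFilesystem itself uses; `fraction`, `average_progress`, and `estimate_remaining` are hypothetical names.

```python
def fraction(done, total):
    """Progress in [0.0, 1.0]; 0.0 while the total is still unknown."""
    if total <= 0:
        return 0.0
    return min(done / total, 1.0)


def average_progress(items_done, items_total, bytes_done, bytes_total):
    """Mean of the item-count progress and the byte-size progress."""
    by_items = fraction(items_done, items_total)
    by_bytes = fraction(bytes_done, bytes_total)
    return (by_items + by_bytes) / 2


def estimate_remaining(elapsed_seconds, progress):
    """Linear extrapolation of the remaining time; None if nothing is done yet."""
    if progress <= 0:
        return None
    return elapsed_seconds * (1 - progress) / progress
```

With half of the items but only a quarter of the bytes processed, `average_progress(50, 100, 10, 40)` gives 0.375, which is more cautious than the item count alone would suggest.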
Please: try, fork and contribute! ;)
Example CLI usage:
~$ git clone https://github.com/jedie/IterFilesystem.git
~$ cd IterFilesystem
~/IterFilesystem$ pipenv install
~/IterFilesystem$ pipenv shell
(IterFilesystem) ~/IterFilesystem$ print_fs_stats --help
(IterFilesystem) ~/IterFilesystem$ pip install -e .
...
Successfully installed iterfilesystem
~/IterFilesystem$ poetry run print_fs_stats --help
usage: print_fs_stats.py [-h] [-v] [--debug] [--path PATH]
                         [--skip_dir_patterns [SKIP_DIR_PATTERNS [SKIP_DIR_PATTERNS ...]]]
                         [--skip_file_patterns [SKIP_FILE_PATTERNS [SKIP_FILE_PATTERNS ...]]]

Scan filesystem and print some information

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  --debug               enable DEBUG
  --path PATH           The file path that should be scanned e.g.: "~/foobar/" default is "~"
  --skip_dir_patterns [SKIP_DIR_PATTERNS [SKIP_DIR_PATTERNS ...]]
                        Directory names to exclude from scan.
  --skip_file_patterns [SKIP_FILE_PATTERNS [SKIP_FILE_PATTERNS ...]]
                        File names to ignore.
Example output looks like this:
(IterFilesystem) ~/IterFilesystem$ print_fs_stats --path ~/IterFilesystem --skip_dir_patterns ".*" "*.egg-info" --skip_file_patterns ".*"
Read/process: '~/IterFilesystem'...
Skip directory patterns:
    * .*
    * *.egg-info
Skip file patterns:
    * .*
Filesystem items..:Read/process: '~/IterFilesystem'...
...
Filesystem items..: 100%||135/135 13737.14entries/s [00:00<00:00, 13737.14entries/s]
File sizes........: 100%||843k/843k [00:00<00:00, 88.5MBytes/s]
Average progress..: 100%||00:00<00:00
Current File......:, /home/jens/repos/IterFilesystem/Pipfile

Processed 135 filesystem items in 0.02 sec
SHA515 hash calculated over all file content: 10f9475b21977f5aea1d4657a0e09ad153a594ab30abc2383bf107dbc60c430928596e368ebefab3e78ede61dcc101cb638a845348fe908786cb8754393439ef
File count: 109
Total file size: 843.5 KB
6 directories skipped.
6 files skipped.
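A scan with skip patterns and a hash over all file contents, similar in spirit to the output above, could be approximated like this. It is a sketch under two assumptions: that the patterns behave like shell-style globs (here matched with `fnmatch`), and that the 128-hex-digit hash shown is SHA-512. The function names and the sorted traversal order are illustrative choices, so the resulting digest will not necessarily match what `print_fs_stats` reports.

```python
import fnmatch
import hashlib
import os


def is_skipped(name, patterns):
    """True if name matches any shell-style pattern such as '.*' or '*.egg-info'."""
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)


def scan(top, skip_dir_patterns=(), skip_file_patterns=()):
    """Walk top, honour the skip patterns, and hash the content of all kept files."""
    stats = {"files": 0, "bytes": 0, "skipped_dirs": 0, "skipped_files": 0}
    digest = hashlib.sha512()
    stack = [top]
    while stack:
        with os.scandir(stack.pop()) as it:
            # Sort by name so the hash does not depend on directory order.
            entries = sorted(it, key=lambda entry: entry.name)
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                if is_skipped(entry.name, skip_dir_patterns):
                    stats["skipped_dirs"] += 1
                else:
                    stack.append(entry.path)
            elif entry.is_file(follow_symlinks=False):
                if is_skipped(entry.name, skip_file_patterns):
                    stats["skipped_files"] += 1
                    continue
                stats["files"] += 1
                stats["bytes"] += entry.stat(follow_symlinks=False).st_size
                with open(entry.path, "rb") as handle:
                    # Read in 1 MiB chunks so large files are not loaded at once.
                    for chunk in iter(lambda: handle.read(1 << 20), b""):
                        digest.update(chunk)
    return stats, digest.hexdigest()
```

A call such as `scan(".", [".*", "*.egg-info"], [".*"])` mirrors the command line shown above.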