Commit edbb97c

add compressing/decompressing files tutorial
1 parent eb553a5 commit edbb97c

10 files changed

+3387
-0
lines changed

README.md

+1
@@ -38,6 +38,7 @@ This is a repository of all the tutorials of [The Python Code](https://www.thepy
     - [How to Make a Screen Recorder in Python](https://www.thepythoncode.com/article/make-screen-recorder-python). ([code](general/screen-recorder))
     - [How to Generate and Read QR Code in Python](https://www.thepythoncode.com/article/generate-read-qr-code-python). ([code](general/generating-reading-qrcode))
     - [How to Download Files in Python](https://www.thepythoncode.com/article/download-files-python). ([code](general/file-downloader))
+    - [How to Compress and Decompress Files in Python](https://www.thepythoncode.com/article/compress-decompress-files-tarfile-python). ([code](general/compressing-files))
 
 - ### [Web Scraping](https://www.thepythoncode.com/topic/web-scraping)
     - [How to Access Wikipedia in Python](https://www.thepythoncode.com/article/access-wikipedia-python). ([code](web-scraping/wikipedia-extractor))

general/compressing-files/README.md

+39
@@ -0,0 +1,39 @@
+# [How to Compress and Decompress Files in Python](https://www.thepythoncode.com/article/compress-decompress-files-tarfile-python)
+To run this:
+- `pip3 install -r requirements.txt`
+-
+    ```
+    python tar.py --help
+    ```
+    **Output:**
+    ```
+    usage: tar.py [-h] [-t TARFILE] [-p PATH] [-f FILES] method
+
+    TAR file compression/decompression using GZIP.
+
+    positional arguments:
+      method                What to do, either 'compress' or 'decompress'
+
+    optional arguments:
+      -h, --help            show this help message and exit
+      -t TARFILE, --tarfile TARFILE
+                            TAR file to compress/decompress, if it isn't specified
+                            for compression, the new TAR file will be named after
+                            the first file to compress.
+      -p PATH, --path PATH  The folder to compress into, this is only for
+                            decompression. Default is '.' (the current directory)
+      -f FILES, --files FILES
+                            File(s),Folder(s),Link(s) to compress/decompress
+                            separated by ','.
+    ```
+- If you want to compress one or more file(s)/folder(s):
+    ```
+    python tar.py compress -f test_folder,test.txt
+    ```
+    This will compress the folder `test_folder` and the file `test.txt` into a single gzip-compressed TAR file named `test_folder.tar.gz`.
+    If you want to name the TAR file yourself, use the `-t` flag.
+- If you want to decompress a TAR file named `test_folder.tar.gz` into a new folder called `extracted`, for instance:
+    ```
+    python tar.py decompress -t test_folder.tar.gz -p extracted
+    ```
+    A new folder `extracted` will appear, containing everything in `test_folder.tar.gz` decompressed.
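
The two README commands above can also be approximated directly with the standard-library `tarfile` module. The following is a minimal sketch, not the code from this commit: the helper names `compress_paths` and `extract_archive` are hypothetical, the progress bar is left out, and `arcname` is set so that only the last path component is stored in the archive (an assumption made for the demo).

```python
import os
import tarfile
import tempfile


def compress_paths(tar_path, paths):
    """Hypothetical helper: add each file/folder in `paths` to a gzip-compressed TAR."""
    with tarfile.open(tar_path, mode="w:gz") as tar:
        for path in paths:
            # store only the final path component inside the archive (demo assumption)
            tar.add(path, arcname=os.path.basename(os.path.normpath(path)))


def extract_archive(tar_path, dest):
    """Hypothetical helper: extract every member of `tar_path` into `dest`."""
    with tarfile.open(tar_path, mode="r:gz") as tar:
        # only extract archives you trust; see the tarfile documentation
        tar.extractall(path=dest)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # recreate a layout similar to the test files added in this commit
        folder = os.path.join(tmp, "test_folder")
        os.makedirs(os.path.join(folder, "subfolder1"))
        with open(os.path.join(folder, "subfolder1", "textfile.txt"), "w") as f:
            f.write("")
        text_file = os.path.join(tmp, "test.txt")
        with open(text_file, "w") as f:
            f.write("Some text")

        archive = os.path.join(tmp, "test_folder.tar.gz")
        compress_paths(archive, [folder, text_file])
        extract_archive(archive, os.path.join(tmp, "extracted"))
        print(sorted(os.listdir(os.path.join(tmp, "extracted"))))
```

Opening the archive in a `with` block closes it even if adding or extracting a member raises, which is the main difference from calling `close()` by hand.
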

general/compressing-files/requirements.txt

+1
@@ -0,0 +1 @@
+tqdm

general/compressing-files/tar.py

+79
@@ -0,0 +1,79 @@
+import tarfile
+from tqdm import tqdm # pip3 install tqdm
+
+
+def decompress(tar_file, path, members=None):
+    """
+    Extracts `tar_file` and puts the `members` into `path`.
+    If members is None, all members in `tar_file` will be extracted.
+    """
+    tar = tarfile.open(tar_file, mode="r:gz")
+    if members is None:
+        members = tar.getmembers()
+    # with progress bar
+    # set the progress bar
+    progress = tqdm(members)
+    for member in progress:
+        tar.extract(member, path=path)
+        # set the progress description (members may be TarInfo objects or plain names)
+        progress.set_description(f"Extracting {getattr(member, 'name', member)}")
+    # or use this
+    # tar.extractall(members=members, path=path)
+    # close the file
+    tar.close()
+
+
+def compress(tar_file, members):
+    """
+    Adds files (`members`) to a tar_file and compresses it
+    """
+    # open file for gzip compressed writing
+    tar = tarfile.open(tar_file, mode="w:gz")
+    # with progress bar
+    # set the progress bar
+    progress = tqdm(members)
+    for member in progress:
+        # add file/folder/link to the tar file (compress)
+        tar.add(member)
+        # set the progress description of the progress bar
+        progress.set_description(f"Compressing {member}")
+    # close the file
+    tar.close()
+
+
+# compress("compressed.tar.gz", ["test.txt", "test_folder"])
+# decompress("compressed.tar.gz", "extracted")
+
+if __name__ == "__main__":
+    import argparse
+    parser = argparse.ArgumentParser(description="TAR file compression/decompression using GZIP.")
+    parser.add_argument("method", help="What to do, either 'compress' or 'decompress'")
+    parser.add_argument("-t", "--tarfile", help="TAR file to compress/decompress, if it isn't specified for compression, the new TAR file will be named after the first file to compress.")
+    parser.add_argument("-p", "--path", help="The folder to compress into, this is only for decompression. Default is '.' (the current directory)", default=".")
+    parser.add_argument("-f", "--files", help="File(s),Folder(s),Link(s) to compress/decompress separated by ','.")
+
+    args = parser.parse_args()
+    method = args.method
+    tar_file = args.tarfile
+    path = args.path
+    files = args.files
+
+    # split by ',' to convert into a list
+    files = files.split(",") if isinstance(files, str) else None
+
+    if method.lower() == "compress":
+        if not files:
+            print("Files to compress not provided, exiting...")
+            exit(1)
+        elif not tar_file:
+            # take the name of the first file
+            tar_file = f"{files[0]}.tar.gz"
+        compress(tar_file, files)
+    elif method.lower() == "decompress":
+        if not tar_file:
+            print("TAR file to decompress is not provided, nothing to do, exiting...")
+            exit(2)
+        decompress(tar_file, path, files)
+    else:
+        print("Method not known, please use 'compress/decompress'.")
+
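
When `-f` is given together with `decompress`, the script passes plain member names (strings) rather than `TarInfo` objects to `decompress`. As a rough illustration of selective extraction under that assumption, the sketch below resolves each name with `TarFile.getmember`, which raises `KeyError` for names that are not in the archive. The helper name `extract_selected` is hypothetical and is not part of this commit.

```python
import os
import tarfile
import tempfile


def extract_selected(tar_path, dest, names):
    """Hypothetical helper: extract only the members listed in `names`."""
    extracted = []
    with tarfile.open(tar_path, mode="r:gz") as tar:
        for name in names:
            # getmember raises KeyError if `name` is not in the archive
            member = tar.getmember(name)
            tar.extract(member, path=dest)
            extracted.append(member.name)
    return extracted


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # build a small archive holding two files
        archive = os.path.join(tmp, "demo.tar.gz")
        with tarfile.open(archive, mode="w:gz") as tar:
            for name in ("a.txt", "b.txt"):
                src = os.path.join(tmp, name)
                with open(src, "w") as f:
                    f.write(name)
                tar.add(src, arcname=name)

        # extract only one of the two members
        out = os.path.join(tmp, "out")
        print(extract_selected(archive, out, ["a.txt"]))
        print(os.listdir(out))
```

Resolving names up front makes a misspelled member fail with a clear `KeyError` instead of partway through a progress loop.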

general/compressing-files/test.txt

+1
@@ -0,0 +1 @@
+Some text

general/compressing-files/test_folder/subfolder1/textfile.txt

Whitespace-only changes.

general/compressing-files/test_folder/subfolder2/textfile.txt

Whitespace-only changes.
