Commit 1a9f213

Create CleanTextFile.py
1 parent 05a591d commit 1a9f213

1 file changed: +14 -0 lines changed

CleanTextFile.py

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+import re
+
+special_char_regex = re.compile(r'[^\w\s]')  # anything that is not a word character or whitespace
+html_tags_regex = re.compile(r'<.*?>')  # non-greedy match of a single tag on one line
+new_line_regex = re.compile(r'\n')
+unwanted_spaces_regex = re.compile(r'\s+')  # runs of whitespace
+
+with open('input.txt', 'r') as input_file, open('output.txt', 'w') as output_file:
+    for line in input_file:
+        line = html_tags_regex.sub('', line)  # strip tags first, while '<' and '>' are still present
+        line = special_char_regex.sub('', line)  # then drop the remaining punctuation
+        line = new_line_regex.sub(' ', line)
+        line = unwanted_spaces_regex.sub(' ', line).strip()  # collapse whitespace, trim the ends
+        output_file.write(line + '\n')
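
The cleaning steps depend on the order in which the patterns are applied: if punctuation is stripped before the tag pattern runs, the angle brackets are already gone and `<.*?>` can no longer match anything. The following is a minimal sketch of the per-line logic as a reusable function, assuming tags should be removed first. The helper name `clean_line` and the upper-case constant names are hypothetical and do not appear in the commit.

```python
import re

# Same patterns as in CleanTextFile.py, under hypothetical constant names.
HTML_TAGS = re.compile(r'<.*?>')
SPECIAL_CHARS = re.compile(r'[^\w\s]')
WHITESPACE = re.compile(r'\s+')


def clean_line(line: str) -> str:
    """Hypothetical helper: strip tags, then punctuation, then collapse whitespace."""
    line = HTML_TAGS.sub('', line)      # tags first, while '<' and '>' still exist
    line = SPECIAL_CHARS.sub('', line)  # then everything that is not \w or \s
    return WHITESPACE.sub(' ', line).strip()


print(clean_line('<p>Hello,   <b>world</b>!</p>\n'))  # Hello world
```

Because `\s+` already matches `\n`, the sketch needs no separate newline pattern. Note that replacing tags with an empty string joins adjacent words (`a<br>b` becomes `ab`); substituting a space instead may be preferable if that matters for the input.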

0 commit comments