Update spell_checker.py #52

Open: wants to merge 3 commits into base: master
59 changes: 46 additions & 13 deletions hanspell/spell_checker.py
@@ -7,6 +7,9 @@
import json
import time
import sys
import re
from cachetools import TTLCache
from urllib import parse
from collections import OrderedDict
import xml.etree.ElementTree as ET

@@ -17,7 +20,24 @@

_agent = requests.Session()
PY3 = sys.version_info[0] == 3
cache = TTLCache(maxsize = 10, ttl = 3600)

def read_token():
    try:
        TOKEN = cache.get('PASSPORT_TOKEN')
        return TOKEN
    except KeyError:
        return None
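
A note on `read_token`: as far as I can tell, `get` on a cachetools cache follows the usual mapping convention and returns a default (`None`) for a missing or expired key rather than raising `KeyError`, so the `except` branch would never run. The sketch below illustrates that behaviour with a small stdlib-only stand-in. `TinyTTLCache` and its `clock` parameter are hypothetical and exist only for this illustration; the PR itself uses `cachetools.TTLCache`, and here `read_token` takes the cache as an argument so it can be tested in isolation.

```python
import time


class TinyTTLCache:
    """Hypothetical stand-in for cachetools.TTLCache, for illustration only."""

    def __init__(self, ttl, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._items = {}  # key -> (value, expiry time)

    def __setitem__(self, key, value):
        self._items[key] = (value, self._clock() + self._ttl)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # drop the stale entry
            return default
        return value


def read_token(cache):
    # get() already yields None on a miss, so no try/except is needed.
    return cache.get('PASSPORT_TOKEN')
```

With this shape, a `None` result from `read_token` is the single signal that a fresh token has to be fetched.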

def update_token(agent):

Review comment from @codnjs042 (May 12, 2024):

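Thanks for the good code. However, I get an error:
UnboundLocalError: local variable 'TOKEN' referenced before assignment
After adding TOKEN = None at line 33, it works correctly!

++ The error occurs on the first call to spell_checker.check, and it works normally from the second call onward. I had mistaken it for a persistent error.
With TOKEN = None added, it works without errors from the first call.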

    html = agent.get(url='https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=1&ie=utf8&query=맞춤법검사기')

    match = re.search('passportKey=([a-zA-Z0-9]+)', html.text)
    if match is not None:
        TOKEN = parse.unquote(match.group(1))
        cache['PASSPORT_TOKEN'] = TOKEN
    return TOKEN
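
The `UnboundLocalError` reported in the review comment is consistent with `TOKEN` being assigned only inside the `if match is not None:` branch while `return TOKEN` runs unconditionally. A minimal sketch of the suggested fix follows, with the extraction step isolated so it can be run without a network call. The helper name `extract_passport_key` is hypothetical and not part of the PR.

```python
import re
from urllib import parse


def extract_passport_key(html_text):
    """Hypothetical helper: return the passportKey found in html_text, or None."""
    token = None  # bound up front so the final return is always valid
    match = re.search(r'passportKey=([a-zA-Z0-9]+)', html_text)
    if match is not None:
        token = parse.unquote(match.group(1))
    return token
```

Callers then need to handle a `None` result explicitly, for example by failing the check instead of sending a request without a key.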

def _remove_tags(text):
    text = u'<content>{}</content>'.format(text).replace('<br>','')
@@ -28,6 +48,30 @@ def _remove_tags(text):

    return result

def get_response(TOKEN, text):

    if TOKEN is None:
        TOKEN = update_token(_agent)

    payload = {
        'passportKey' : TOKEN,
        'q': text,
        'color_blindness': 0
    }

    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36',
        'referer': 'https://search.naver.com/',
    }

    r = _agent.get(base_url, params=payload, headers=headers)
    data = json.loads(r.text)

    if 'error' in data['message'] :
        r = get_response(update_token(_agent), text)

    return r
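
One thing to watch in `get_response`: the retry is recursive with no limit, so if every refreshed token is also rejected, the function keeps calling itself. Below is a sketch of a bounded variant, assuming the same response shape the PR relies on (an `'error'` key under `'message'`). The names `get_response_bounded`, `fetch`, `refresh_token`, and `max_refreshes` are hypothetical; the HTTP call and the token refresh are passed in as callables so the logic can be exercised without hitting the real service.

```python
import json


def get_response_bounded(fetch, refresh_token, token, text, max_refreshes=1):
    """Fetch a result, refreshing the token at most max_refreshes times.

    fetch(token, text) returns the response body as a string;
    refresh_token() returns a new token. Both are hypothetical hooks.
    """
    if token is None:
        token = refresh_token()
    body = fetch(token, text)
    for _ in range(max_refreshes):
        if 'error' not in json.loads(body)['message']:
            break  # the token was accepted
        token = refresh_token()
        body = fetch(token, text)
    return body
```

If the last attempt still fails, the error body is returned to the caller, which can then report the failure instead of looping.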


def check(text):
    """
@@ -43,21 +87,10 @@ def check(text):
    # Up to 500 characters allowed.
    if len(text) > 500:
        return Checked(result=False)

    payload = {
        'color_blindness': '0',
        'q': text
    }

    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36',
        'referer': 'https://search.naver.com/',
    }


    start_time = time.time()
    r = _agent.get(base_url, params=payload, headers=headers)
    r = get_response(read_token(), text)
    passed_time = time.time() - start_time

    data = json.loads(r.text)
    html = data['message']['result']['html']
    result = {