Commit 6fb09e6

Merge pull request selfteaching#3 from selfteaching/master
daily merge to local branch.
2 parents ac4f088 + e9fbcdb commit 6fb09e6

210 files changed: +17576 -1100 lines changed

.vscode/launch.json

Lines changed: 0 additions & 70 deletions
This file was deleted.

.vscode/settings.json

Lines changed: 0 additions & 3 deletions
This file was deleted.

19100101/Shawn/d11_training1.py

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+import requests
+import yagmail
+import getpass
+from pyquery import PyQuery
+from mymodule import stats_word
+
+# Set the sender, login password, and recipient
+sender = input('Enter the sender email address: ')
+psw = input('Enter the sender email login password: ')
+recipient = input('Enter the recipient email address: ')
+smtp = 'smtp.qq.com'
+
+# Fetch the WeChat official account article from the URL
+response = requests.get('https://mp.weixin.qq.com/s/pLmuGoc4bZrMNl7MSoWgiA')
+
+# Extract the body of the WeChat article
+document = PyQuery (response.text)
+content = document ('#js_content').text()
+
+# Count the top 100 word frequencies
+statList = stats_word.stats_text(content,100)
+statString = ''.join(str(i) for i in statList)
+
+# Send the statistics to
+yagmail.SMTP(sender,psw,smtp).send(recipient,'shawn',statString)
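
The script above imports getpass but reads the password with input(), which echoes the typed characters to the terminal. A minimal sketch of a non-echoing prompt might look like the following. The function name read_credentials and its ask / ask_secret parameters are hypothetical, introduced here only so the prompts can be swapped out; they are not part of the commit.

```python
import getpass


def read_credentials(ask=input, ask_secret=getpass.getpass):
    """Prompt for sender, password, and recipient (hypothetical helper)."""
    sender = ask('Sender email address: ').strip()
    # getpass.getpass does not echo what is typed, unlike input().
    password = ask_secret('Sender email password: ')
    recipient = ask('Recipient email address: ').strip()
    return sender, password, recipient
```

Calling read_credentials() with no arguments prompts interactively; passing replacement callables makes the helper usable without a terminal.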

19100101/WangRui0802/README.md

Lines changed: 4 additions & 1 deletion
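@@ -69,4 +69,7 @@ Day10
 But during self-study, those preconceived notions keep popping up. Every time I solve a problem and run the program, I think: oh, so that is how it works, it is that simple, and you actually spent half an hour on it. It is exactly like feeling full after the seventh flatbread and forgetting what the first six contributed. I understand the principle, but the urge to finish tasks faster, better, and cheaper, and to save time, keeps getting in the way. Leave yourself an ample budget for learning a skill, longer than the "very long" you naively imagined. Think deliberately about which parts need deliberate practice. That is how the process goes; who knows which flatbread will be the one that fills you up next time. Looks like I should change my nickname to Take It Slow!
 Hello everyone, I am Take It Slow!
 
-
+Day11
+Take It Slow was nearly done in by a single problem today! The import line "from mymodule import stats_word" kept raising an error.
+I sent a screenshot of the file directory to the coach and my teammates, and nobody could spot anything either. So I stopped agonizing over it and bluntly started over from scratch: I recreated the folder and rewrote the functions. The error never appeared again. Perhaps we ought to analyze the cause of an error carefully, but when you are just getting started, it really does no harm to start over and not be afraid of repeating from zero.
+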
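
The entry does not record what actually caused the import error. As a sketch of one way to inspect such a failure before starting over, the standard library can report whether a module is findable at all. The helper name explain_import is hypothetical, and the sketch assumes the problem is one of module lookup rather than an error inside the module itself.

```python
import importlib.util


def explain_import(module_name):
    """Return a short diagnosis of whether `module_name` can be located."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError as exc:
        # For dotted names, find_spec imports the parent package first,
        # so a missing parent (e.g. `mymodule`) surfaces here.
        return f'parent package missing: {exc.name}'
    if spec is None:
        return f'{module_name} not found on sys.path'
    return f'{module_name} found at {spec.origin}'


print(explain_import('mymodule.stats_word'))
```

Running this from the same directory as the failing script shows whether Python can see the mymodule folder from there; printing sys.path is a natural next step if it cannot.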

19100101/WangRui0802/d11_training1.py

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+# Date:19/03/28
+
+import requests
+import yagmail
+import getpass
+from pyquery import PyQuery
+from mymodule import stats_word
+
+sender = input('Enter the sender email address: ')
+psw = input('Enter the sender email login password: ')
+recipient = input('Enter the recipient email address: ')
+smtp = 'smtp.qq.com'
+
+response = requests.get('https://mp.weixin.qq.com/s/pLmuGoc4bZrMNl7MSoWgiA')
+
+document = PyQuery (response.text)
+content = document ('#js_content').text()
+
+statList = stats_word.stats_text(content, 100)
+statString = ''.join(str(i) for i in statList)
+
+yagmail.SMTP(sender,psw,smtp).send(recipient,'19100101 WangRui0802',statString)
+
+
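
In the script above, ''.join(str(i) for i in statList) concatenates the tuple representations with no separator, so the email body reads like ('a', 2)('b', 1). A small sketch of a more readable body, one "word: count" line per entry, is shown below. The helper name format_stats is hypothetical, and the sketch assumes statList holds (word, count) pairs as returned by Counter.most_common().

```python
from collections import Counter


def format_stats(stat_list):
    """Render (word, count) pairs as one 'word: count' line each."""
    return '\n'.join(f'{word}: {count}' for word, count in stat_list)


sample = Counter(['mountain', 'mountain', 'sea']).most_common(2)
print(format_stats(sample))
```

The resulting string can be passed to send() in place of statString.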

19100101/WangRui0802/main.py

Lines changed: 11 additions & 56 deletions
@@ -1,58 +1,13 @@
-text = '''
-愚公移山
-
-太行,王屋二山的北面,住了一個九十歲的老翁,名叫愚公。二山佔地廣闊,擋住去路,使他和家人往來極為不便。
-
-一天,愚公召集家人說:「讓我們各盡其力,剷平二山,開條道路,直通豫州,你們認為怎樣?」
-大家都異口同聲贊成,只有他的妻子表示懷疑,並說:「你連開鑿一個小丘的力量都沒有,怎可能剷平太行、王屋二山呢?況且,鑿出的土石又丟到哪裏去呢?」
-
-大家都熱烈地說:「把土石丟進渤海裏。」
-於是愚公就和兒孫,一起開挖土,把土石搬運到渤海去。
-愚公的鄰居是個寡婦,有個兒子八歲也興致勃勃地走來幫忙。
-寒來暑往,他們要一年才能往返渤海一次。
-
-住在黃河河畔的智叟,看見他們這樣辛苦,取笑愚公說:「你不是很愚蠢嗎?你已一把年紀了,就是用盡你的氣力,也不能挖去山的一角呢?」
-
-愚公歎息道:「你有這樣的成見,是不會明白的。你比那寡婦的小兒子還不如呢!就算我死了,還有我的兒子,我的孫子,我的曾孫子,他們一直傳下去。而這二山是不會加大的,總有一天,我們會把它們剷平。」
-
-智叟聽了,無話可說:
-二山的守護神被愚公的堅毅精神嚇倒,便把此事奏知天帝。天帝佩服愚公的精神,就命兩位大力神揹走二山。
-
-How The Foolish Old Man Moved Mountains
-
-Yugong was a ninety-year-old man who lived at the north of two high mountains, Mount Taixing and Mount Wangwu.
-
-Stretching over a wide expanse of land, the mountains blocked yugong’s way making it inconvenient for him and his family to get around.
-One day yugong gathered his family together and said,”Let’s do our best to level these two mountains. We shall open a road that leads to Yuzhou. What do you think?”
-
-All but his wife agreed with him.
-“You don’t have the strength to cut even a small mound,” muttered his wife. “How on earth do you suppose you can level Mount Taixin and Mount Wanwu? Moreover, where will all the earth and rubble go?”
-“Dump them into the Sea of Bohai!” said everyone.
-
-So Yugong, his sons, and his grandsons started to break up rocks and remove the earth. They transported the earth and rubble to the Sea of Bohai.
-
-Now Yugong’s neighbour was a widow who had an only child eight years old. Evening the young boy offered his help eagerly.
-
-Summer went by and winter came. It took Yugong and his crew a full year to travel back and forth once.
-
-On the bank of the Yellow River dwelled an old man much respected for his wisdom. When he saw their back-breaking labour, he ridiculed Yugong saying,”Aren’t you foolish, my friend? You are very old now, and with whatever remains of your waning strength, you won’t be able to remove even a corner of the mountain.”
-
-Yugong uttered a sigh and said,”A biased person like you will never understand. You can’t even compare with the widow’s little boy!”
-
-“Even if I were dead, there will still be my children, my grandchildren, my great grandchildren, my great great grandchildren. They descendants will go on forever. But these mountains will not grow any taler. We shall level them one day!” he declared with confidence.
-
-The wise old man was totally silenced.
-When the guardian gods of the mountains saw how determined Yugong and his crew were, they were struck with fear and reported the incident to the Emperor of Heavens.
-
-Filled with admiration for Yugong, the Emperor of Heavens ordered two mighty gods to carry the mountains away.
-'''
-
-with open('tang300.json') as t:
-    read_file = t.read()
+with open('tang300.json') as t :
+    ''' 1. Open the JSON file and read its contents'''
+    read_file = t.read()
 t.closed
 
-from mymodule import stats_word # import the stats_word module
-try:
-    print('Top 20 Chinese word-frequency results:', stats_word.stats_text_cn(read_file, 20))
-except ValueError as a :
-    print(a)
+from mymodule import stats_word
+''' 1. Catch the exception raised when a non-string argument is passed in.
+    2. Call stats_text_cn() in stats_word.py, passing the file contents and the output limit.
+'''
+try :
+    print('Top 20 Chinese word-frequency results: ', stats_word.stats_text_cn(read_file,20))
+except ValueError as ve :
+    print(ve)
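
In main.py, the bare expression t.closed only evaluates the attribute and discards the value; the with block has already closed the file by that point. The sketch below illustrates this with a throwaway file. The file name sample.json and its one-entry content are assumptions standing in for tang300.json, whose structure is not shown in the commit.

```python
import json
import os
import tempfile

# Write a tiny stand-in for tang300.json so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'sample.json')
with open(path, 'w', encoding='utf-8') as f:
    json.dump([{'title': '静夜思'}], f, ensure_ascii=False)

with open(path, encoding='utf-8') as t:
    read_file = t.read()       # raw text, the form stats_text_cn receives

print(t.closed)                # True: leaving the with block closed the file
poems = json.loads(read_file)  # parse only if structured access is needed
```

Passing encoding='utf-8' explicitly is a choice made for this sketch; it avoids depending on the platform default when the file contains Chinese text.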
Lines changed: 30 additions & 57 deletions
@@ -1,73 +1,46 @@
-# A mixed Chinese-English text
-str = '''
-The Zen of Python, by Tim Peters
-
-
-Beautiful is better than ugly.
-Explicit is better than implicit.
-Simple is better than complex.
-Complex is better than complicated.
-Flat is better than nested.
-Sparse is better than dense.
-Readability counts.
-Special cases aren't special enough to break the rules.
-Although practicality beats purity.
-Errors should never pass silently.
-Unless explicitly silenced.
-In the face of ambxiguity, refuse the temptation to guess.
-There should be one-- and preferably only one --obvious way to do it.
-Although that way may not be obvious at first unless you're Dutch.
-Now is better than never.
-Although never is often better than *right* now.
-If the implementation is hard to explain, it's a bad idea.
-If the implementation is easy to explain, it may be a good idea.
-Namespaces are one honking great idea -- let's do more of those!
-Python是一种计算机程序设计语言。是一种动态的、面向对象的脚本语言,最初被设计用于编写自动化脚本(shell),随着版本的不断更新和语言新功能的添加,越来越多被用于独立的、大型项目的开发。
-'''
-
 import collections
 import re
 import jieba
-
-def stats_text_en(en,count):
-    ''' 1. English word-frequency statistics.
+
+def stats_text_en(en,count) :
+    ''' 1. English word-frequency statistics: filter English characters with a regular expression, then count and sort with Counter.
     2. Argument type check: raise an exception if the argument is not a string.
     '''
-    if type(en) == str:
-        text_en = re.sub("[^A-Za-z]", " ", text_en.strip())
+    if type(en) == str :
+        text_en = re.sub("[^A-Za-z]", " ", en.strip())
         enList = text_en.split( )
         return collections.Counter(enList).most_common(count)
-    else:
-
-        raise ValueError('it is not str')
-
-def stats_text_cn(cn,count):
-    ''' 1. Chinese character-frequency statistics
-    2. Argument type check: raise an exception if the argument is not a string.
+    else :
+        raise ValueError ('type of argument is not str')
+
+def stats_text_cn(cn,count) :
+    ''' 1. Segment words with the third-party jieba library in accurate mode.
+    2. Filter Chinese characters with a regular expression.
+    3. Use a for loop to build a new list from the segmented words whose length is at least 2.
+    4. Use the standard library collections.Counter() to count word frequencies and limit the number of results.
+    5. Argument type check: raise an exception if the argument is not a string.
     '''
     if type(cn) == str :
-        cnList = re.findall(u'[\u4e00-\u9fff]+', text_cn.strip())
+        cnList = re.findall(u'[\u4e00-\u9fff]+', cn.strip())
         cnString = ''.join(cnList)
-        segList = jieba.cut(cnString, cut_all=False)
+        segList = jieba.cut(cnString,cut_all=False)
         cnnewList = []
-        for i in segList:
-            if len(i) >= 2
-                cnnewList.append(i)
-            else:
-                pass
-        countList = collections.Counter(newString).most_common(count)
+        for i in segList :
+            if len(i) >= 2 :
+                cnnewList.append(i)
+            else :
+                pass
+        countList = collections.Counter(cnnewList).most_common(count)
         return countList
-    else:
-
-        raise ValueError ('it is not str')
-
+    else :
+        raise ValueError ('type of argument is not str')
+
 def stats_text(text_en_cn,count_en_cn) :
-    ''' 1. Combined English and Chinese word-frequency statistics
+    ''' 1. Combined English and Chinese word-frequency statistics: call stats_text_en() and stats_text_cn() and merge their results.
     2. Argument type check: raise an exception if the argument is not a string.
     '''
-    if type(text_en_cn) == str:
+    if type(text_en_cn) == str :
         return (stats_text_en(text_en_cn,count_en_cn)+stats_text_cn(text_en_cn,count_en_cn))
-    else:
-
-        raise ValueError ('it is not str')
-
+    else :
+        raise ValueError ('type of argument is not str')
+
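
The English half of the pipeline above (filter with a regular expression, split, count with Counter) can be sketched on its own with only the standard library. The function name top_english_words is hypothetical. The sketch uses isinstance() where the commit uses type(...) == str, a small deviation, and keeps ValueError to match the commit. Like the original, it is case-sensitive.

```python
import collections
import re


def top_english_words(text, count):
    """Return the `count` most common English words in `text`."""
    if not isinstance(text, str):
        raise ValueError('type of argument is not str')
    # Replace every non-letter with a space, then split on whitespace.
    words = re.sub('[^A-Za-z]', ' ', text).split()
    return collections.Counter(words).most_common(count)


print(top_english_words('Flat is better than nested. Sparse is better.', 2))
```

The Chinese half follows the same shape, with jieba.cut() producing the word list in place of split().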

19100101/YanHuiii/d11_training1.py

Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
'''这是一个通过网络请求获得网页内容,使用分词工具对中文字符串
2+
进行分词,统计词频,得出结果,并发送到指定邮箱的程序'''
3+
import requests
4+
import pyquery
5+
from pyquery import PyQuery
6+
from mymodule import stats_word
7+
8+
'''访问网址'''
9+
image_url = "https://mp.weixin.qq.com/s/pLmuGoc4bZrMNl7MSoWgiA"
10+
'''将网址中的内容全部赋值给response'''
11+
response = requests.get(image_url)
12+
'''提取网址中的正文内容'''
13+
document = pyquery.PyQuery(response.text)
14+
content = document('#js_content').text()
15+
import getpass
16+
sender = input('输入发件人邮箱:')
17+
password = getpass.getpass('输入发件人邮箱密码(可复制粘贴):')
18+
recipients = input('输入收件人邮箱:')
19+
20+
import yagmail
21+
22+
yag = yagmail.SMTP(user=sender,password=password,host='smtp.163.com')
23+
contents = [stats_word.stats_text_cn(content,100)]
24+
yag.send(recipients,'主题:张小龙微信公开课演讲稿中文词频前100名统计',contents)
25+
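
The call document('#js_content').text() selects the element with id js_content and returns its text. To show what that selection amounts to, here is a rough standard-library stand-in built on html.parser. It is not PyQuery and not how PyQuery is implemented; the class IdTextExtractor, the function extract_text, and the sample HTML are all hypothetical.

```python
from html.parser import HTMLParser


class IdTextExtractor(HTMLParser):
    """Collect the text inside the first element whose id matches."""

    # Void elements have no closing tag, so they must not change the depth.
    VOID = {'br', 'img', 'hr', 'input', 'meta', 'link'}

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target element
        self.done = False   # set once the target element has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID or self.done:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get('id') == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in self.VOID or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth and data.strip():
            self.parts.append(data.strip())


def extract_text(html, target_id):
    parser = IdTextExtractor(target_id)
    parser.feed(html)
    parser.close()
    return ' '.join(parser.parts)


page = ('<html><body><div id="js_content"><p>Hello <b>world</b></p><br>'
        '<p>again</p></div><p>footer</p></body></html>')
print(extract_text(page, 'js_content'))
```

The sketch assumes well-formed HTML; for real pages, PyQuery as used in the commit is the more robust choice.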

19100101/lidong2119/d11_training1.py

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
import requests
2+
import yagmail
3+
import getpass
4+
from pyquery import PyQuery
5+
from mymodule import stats_word
6+
7+
image_url = "https://mp.weixin.qq.com/s/pLmuGoc4bZrMNl7MSoWgiA"
8+
response = requests.get(image_url)
9+
document = PyQuery(response.text)
10+
content = document('#js_content').text()
11+
12+
sender = input('输入发件人邮箱地址:')
13+
password = getpass.getpass('输入发件人邮箱密码(可复制粘贴):')
14+
recipient = input('输入收件人邮箱地址:')
15+
16+
17+
yag = yagmail.SMTP(user=sender,password=password,host='smtp.qq.com')
18+
content = [stats_word.stats_text_cn(content,100)]
19+
yag.send(recipient,'lidong2119 d11',content)

19100101/qiming09/README.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,7 @@
1+
2019.3.28
2+
1. learn to use 3rd-party library "requests","yagmail","pyquery".
3+
2. use "requests" to get information from a url, get the content, cut the words and email them.
4+
15
2019.3.27
26
1. learn to use conda to install 3rd-party library "jieba".
37
2. use "jieba" to cut string into words.

19100101/qiming09/d11_training1.py

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,29 @@
1+
# this is d11 excercise for 3rd-party library yagmail,requests,pyquery
2+
# date : 2019.3.28
3+
# author by : qiming
4+
5+
import requests
6+
import yagmail
7+
import getpass
8+
from pyquery import PyQuery
9+
from mymodule import stats_word
10+
11+
# 设置发件人、登录密码、收件人
12+
sender = input('请输入发件人邮箱地址:')
13+
psw = input('请输入发件人邮箱登录密码:')
14+
recipient = input('请输入收件人邮箱地址:')
15+
smtp = 'smtp.163.com'
16+
17+
# 访问url获取微信公众号文章
18+
response = requests.get('https://mp.weixin.qq.com/s/pLmuGoc4bZrMNl7MSoWgiA')
19+
20+
# 提取微信公众号正文
21+
document = PyQuery (response.text)
22+
content = document ('#js_content').text()
23+
24+
# 统计前100词频
25+
statList = stats_word.stats_text(content,100)
26+
statString = ''.join(str(i) for i in statList)
27+
28+
# 将统计结果发送到 [email protected],title: 19100101 qiming
29+
yagmail.SMTP(sender,psw,smtp).send(recipient,'19100101 qiming',statString)
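
The last step of the script hands a subject and a body to yagmail. To make the shape of that message explicit without contacting a mail server, the sketch below builds an equivalent message with the standard library's email.message.EmailMessage. This is a stand-in for illustration, not what yagmail does internally; the helper name build_message, the example.com addresses, and the sample counts are assumptions.

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, stat_list):
    """Build (but do not send) a plain-text email listing word counts."""
    msg = EmailMessage()
    msg['From'] = sender
    msg['To'] = recipient
    msg['Subject'] = subject
    # One tab-separated "word<TAB>count" line per entry.
    body = '\n'.join(f'{word}\t{count}' for word, count in stat_list)
    msg.set_content(body)
    return msg


msg = build_message('sender@example.com', 'recipient@example.com',
                    '19100101 qiming', [('微信', 12), ('用户', 9)])
print(msg['Subject'])
```

Building the message separately from sending it makes the body easy to inspect before any credentials are entered.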
