+ {
+   "nbformat": 4,
+   "nbformat_minor": 0,
+   "metadata": {
+     "colab": {
+       "name": "Naver Movie Review Data Analysis 1. Crawling Movie Information",
+       "provenance": [],
+       "collapsed_sections": [],
+       "include_colab_link": true
+     },
+     "kernelspec": {
+       "name": "python3",
+       "display_name": "Python 3"
+     }
+   },
+   "cells": [
+     {
+       "cell_type": "markdown",
+       "metadata": {
+         "id": "view-in-github",
+         "colab_type": "text"
+       },
+       "source": [
+         "<a href=\"https://colab.research.google.com/github/ndb796/Python-Data-Analysis-and-Image-Processing-Tutorial/blob/master/29.%20%EB%84%A4%EC%9D%B4%EB%B2%84%20%EC%98%81%ED%99%94%20%EB%A6%AC%EB%B7%B0%20%EB%8D%B0%EC%9D%B4%ED%84%B0%20%EB%B6%84%EC%84%9D%20%E2%91%A0%20%EC%98%81%ED%99%94%20%EC%A0%95%EB%B3%B4%20%ED%81%AC%EB%A1%A4%EB%A7%81/%EB%84%A4%EC%9D%B4%EB%B2%84%20%EC%98%81%ED%99%94%20%EB%A6%AC%EB%B7%B0%20%EB%8D%B0%EC%9D%B4%ED%84%B0%20%EB%B6%84%EC%84%9D%20%E2%91%A0%20%EC%98%81%ED%99%94%20%EC%A0%95%EB%B3%B4%20%ED%81%AC%EB%A1%A4%EB%A7%81.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+       ]
+     },
+     {
+       "cell_type": "markdown",
+       "metadata": {
+         "id": "_hiyITsUk0Ze",
+         "colab_type": "text"
+       },
+       "source": [
+         "## Naver Movie Review Data Analysis 1. Crawling Movie Information\n",
+         "[Lecture notes](https://github.com/ndb796/Python-Data-Analysis-and-Image-Processing-Tutorial/blob/master/29.%20%EB%84%A4%EC%9D%B4%EB%B2%84%20%EC%98%81%ED%99%94%20%EB%A6%AC%EB%B7%B0%20%EB%8D%B0%EC%9D%B4%ED%84%B0%20%EB%B6%84%EC%84%9D%20%E2%91%A0%20%EC%98%81%ED%99%94%20%EC%A0%95%EB%B3%B4%20%ED%81%AC%EB%A1%A4%EB%A7%81/%EB%84%A4%EC%9D%B4%EB%B2%84%20%EC%98%81%ED%99%94%20%EB%A6%AC%EB%B7%B0%20%EB%8D%B0%EC%9D%B4%ED%84%B0%20%EB%B6%84%EC%84%9D%20%E2%91%A0%20%EC%98%81%ED%99%94%20%EC%A0%95%EB%B3%B4%20%ED%81%AC%EB%A1%A4%EB%A7%81.pdf)"
+       ]
+     },
+     {
+       "cell_type": "markdown",
+       "metadata": {
+         "id": "xCYPvtsxkN6P",
+         "colab_type": "text"
+       },
+       "source": [
+         "**Writing the Review Information Class**"
+       ]
+     },
+     {
+       "cell_type": "code",
+       "metadata": {
+         "id": "RWjUFzDCkPN3",
+         "colab_type": "code",
+         "colab": {}
+       },
+       "source": [
+         "import urllib.request\n",
+         "from bs4 import BeautifulSoup\n",
+         "\n",
+         "class Review:\n",
+         "    # Holds one scraped review; every field is kept as a string.\n",
+         "    def __init__(self, comment, date, star, good, bad):\n",
+         "        self.comment = comment\n",
+         "        self.date = date\n",
+         "        self.star = star\n",
+         "        self.good = good\n",
+         "        self.bad = bad\n",
+         "\n",
+         "    # Labels: 내용 = content, 날짜 = date, 별점 = star rating,\n",
+         "    # 좋아요 = likes, 싫어요 = dislikes.\n",
+         "    def show(self):\n",
+         "        print(\"내용: \" + self.comment +\n",
+         "              \"\\n날짜: \" + self.date +\n",
+         "              \"\\n별점: \" + self.star +\n",
+         "              \"\\n좋아요: \" + self.good +\n",
+         "              \"\\n싫어요: \" + self.bad)"
+       ],
+       "execution_count": 0,
+       "outputs": []
+     },
+     {
+       "cell_type": "markdown",
+       "metadata": {
+         "id": "VpgHVILNkf4_",
+         "colab_type": "text"
+       },
+       "source": [
+         "**Review Crawling Function**"
+       ]
+     },
+     {
+       "cell_type": "code",
+       "metadata": {
+         "id": "XLW9RIx9kZU8",
+         "colab_type": "code",
+         "colab": {}
+       },
+       "source": [
+         "def crawl(url):\n",
+         "    # Download the page and parse it with Python's standard-library HTML parser.\n",
+         "    soup = BeautifulSoup(urllib.request.urlopen(url).read(), \"html.parser\")\n",
+         "    review_list = []\n",
+         "    title = soup.find('h3', class_='h_movie').find('a').text\n",
+         "    div = soup.find(\"div\", class_=\"score_result\")\n",
+         "    data_list = div.select(\"ul > li\")\n",
+         "\n",
+         "    # Each <li> under div.score_result holds one review.\n",
+         "    for review in data_list:\n",
+         "        star = review.find(\"div\", class_=\"star_score\").text.strip()\n",
+         "        reply = review.find(\"div\", class_=\"score_reple\")\n",
+         "        comment = reply.find(\"p\").text\n",
+         "        date = reply.select(\"dt > em\")[1].text.strip()\n",
+         "        button = review.find(\"div\", class_=\"btn_area\")\n",
+         "        # The two counters are the like and dislike totals, in that order.\n",
+         "        sympathy = button.select(\"strong > span\")\n",
+         "        good = sympathy[0].text\n",
+         "        bad = sympathy[1].text\n",
+         "        review_list.append(Review(comment, date, star, good, bad))\n",
+         "\n",
+         "    return title, review_list"
+       ],
+       "execution_count": 0,
+       "outputs": []
+     },
+     {
+       "cell_type": "markdown",
+       "metadata": {
+         "id": "p8Qqj8-ck3zA",
+         "colab_type": "text"
+       },
+       "source": [
+         "**Review Crawling Exercise**"
+       ]
+     },
+     {
+       "cell_type": "code",
+       "metadata": {
+         "id": "HYBdejwgk5Hg",
+         "colab_type": "code",
+         "colab": {
+           "base_uri": "https://localhost:8080/",
+           "height": 474
+         },
+         "outputId": "8c75bb48-88c4-4846-9667-9e9c2f046408"
+       },
+       "source": [
+         "title, review_list = crawl(\"https://movie.naver.com/movie/bi/mi/basic.nhn?code=36944\")\n",
+         "print('제목: ' + title)  # 제목 = title\n",
+         "for review in review_list:\n",
+         "    review.show()"
+       ],
+       "execution_count": 8,
+       "outputs": [
+         {
+           "output_type": "stream",
+           "text": [
+             "제목: 올드보이\n",
+             "내용: 이 영화는 필요 이상으로 너무 잘만들었다. 인간이 만든 작품이 아니다. \n",
+             "날짜: 2013.06.09 17:59\n",
+             "별점: 10\n",
+             "좋아요: 2859\n",
+             "싫어요: 174\n",
+             "내용: 충격적인 영화 촬영 기법, 스토리, 눈물샘을 자극시키는 사운드트랙. 대중영화 예술에 큰 기여를 한 혁명적인 영화. \n",
+             "날짜: 2013.06.09 01:08\n",
+             "별점: 10\n",
+             "좋아요: 1843\n",
+             "싫어요: 76\n",
+             "내용: 사람은 상상력이 있어서 비겁해 지는거래... \n",
+             "날짜: 2013.07.17 14:26\n",
+             "별점: 10\n",
+             "좋아요: 1642\n",
+             "싫어요: 62\n",
+             "내용: 10년만에 다시 본 올드보이. 역시 최고였다. \n",
+             "날짜: 2013.07.28 01:53\n",
+             "별점: 10\n",
+             "좋아요: 1274\n",
+             "싫어요: 60\n",
+             "내용: 지금껏본 영화중 제일 재미있었다스토리 전개 하나도나무랄데 없는 작품 \n",
+             "날짜: 2013.06.06 23:11\n",
+             "별점: 10\n",
+             "좋아요: 1118\n",
+             "싫어요: 66\n"
+           ],
+           "name": "stdout"
+         }
+       ]
+     }
+   ]
+ }
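
The `Review` class in this notebook keeps every scraped field as a string, which is why `show()` can join them with `+`. If the reviews are later sorted or aggregated, typed fields may be more convenient. The following is a minimal sketch of that idea, not part of the notebook: `ParsedReview` and `parse_review` are hypothetical names, the date format is an assumption inferred from the recorded output (`2013.06.09 17:59`), and the sample rows are copied from that output rather than fetched from the site.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ParsedReview:
    """Hypothetical typed counterpart of the notebook's Review class."""
    comment: str
    date: datetime
    star: int
    good: int
    bad: int


def parse_review(comment, date, star, good, bad):
    """Convert the scraped strings into typed fields.

    Assumes the formats seen in the recorded output: dates such as
    '2013.06.09 17:59' and plain decimal strings for the counts.
    """
    return ParsedReview(
        comment=comment.strip(),
        date=datetime.strptime(date, "%Y.%m.%d %H:%M"),
        star=int(star),
        good=int(good),
        bad=int(bad),
    )


# Two rows copied from the recorded output of the last cell.
rows = [
    ("10년만에 다시 본 올드보이. 역시 최고였다. ", "2013.07.28 01:53", "10", "1274", "60"),
    ("사람은 상상력이 있어서 비겁해 지는거래... ", "2013.07.17 14:26", "10", "1642", "62"),
]
reviews = [parse_review(*row) for row in rows]

# Numeric fields can now be compared and aggregated directly.
most_liked = max(reviews, key=lambda r: r.good)
net_votes = [r.good - r.bad for r in reviews]
print(most_liked.date.isoformat(), net_votes)
```

With the notebook's own `crawl`, the same conversion could be applied to each item of `review_list` by passing its `comment`, `date`, `star`, `good`, and `bad` attributes to `parse_review`, provided the page still returns values in these formats.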