703. Kth Largest Element in a Stream
Notes are written in the order I worked through the problem.
step1. pass
```python
class KthLargest:

    def __init__(self, k: int, nums: List[int]):
        self.k = k
        self.nums = nums

    def add(self, val: int) -> int:
        self.nums.append(val)
        if self.nums == None:
            return None
        self.nums.sort(reverse=True)
        return self.nums[self.k - 1]
```
Review comment (on `self.nums.sort(reverse=True)`): https://docs.python.org/ja/3.6/library/bisect.html
It passed, but at 1834 ms it is quite slow.
At first I had forgotten how to refer to members across methods and how to use the sort function, so I checked the Python documentation.
(https://docs.python.org/3/tutorial/classes.html#class-and-instance-variables)
(https://docs.python.org/3/library/stdtypes.html#list.sort)

The code appends the given value to the list, sorts the list in descending order, and returns the k-th value.
If neither the list nor a value exists, it returns None.

I suspected that Python's sort method was the reason it was so slow, so I looked it up.
(https://www.naukri.com/code360/library/difference-between-sort-and-sorted-in-python)
>The sort() function compares the first two elements of the list and swaps them if they are not in order. It then compares the next element with the first element, switches them if necessary, and moves on to the next element until the entire input list is sorted.

>O(nlog(n)) : The time complexity of sort() function in Python is O(n log n) on average, and in the worst case, where n is the number of elements in the list to be sorted. This is because sort() use the timsort algorithm, which has this time complexity.

Timsort is a hybrid of insertion sort and merge sort.
(https://en.wikipedia.org/wiki/Timsort)

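
Because Timsort takes advantage of runs that are already sorted, the cost of one sort() call depends heavily on the input order. The sketch below counts comparisons for a shuffled list versus a sorted list with one value appended (the situation inside step1's add). `Counted` and `count_sort_comparisons` are hypothetical helpers written for this illustration, and the exact counts may differ between CPython versions.

```python
import random


class Counted:
    """Wraps an int and counts every `<` comparison made by sort()."""
    comparisons = 0

    def __init__(self, value: int):
        self.value = value

    def __lt__(self, other: "Counted") -> bool:
        Counted.comparisons += 1
        return self.value < other.value


def count_sort_comparisons(values: list[int]) -> int:
    """Return how many comparisons list.sort() needs for `values`."""
    items = [Counted(v) for v in values]
    Counted.comparisons = 0
    items.sort()
    return Counted.comparisons


n = 10_000
random.seed(0)  # fixed seed so runs are repeatable
shuffled = random.sample(range(n), n)
nearly_sorted = list(range(n)) + [n // 2]  # sorted list plus one appended value

shuffled_count = count_sort_comparisons(shuffled)
nearly_sorted_count = count_sort_comparisons(nearly_sorted)
print(shuffled_count, nearly_sorted_count)
```

The nearly sorted input should need far fewer comparisons, so n log n is best read as an upper bound for re-sorting an almost sorted list.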
O(n log n) doesn't sound bad, so why is it this slow?

Review comment: I recommend working out the time complexity of the whole process and estimating the running time from that. I have previously left comments on other people's code explaining how to estimate running time, so I suggest looking for those.

Reply: Thank you for going out of your way to find it 🙇‍♂️

Next I looked at other people's code.
(https://github.com/Fuminiton/LeetCode/pull/8)
I hadn't realized there were heap functions.
(https://github.com/katataku/leetcode/pull/8)
Ah, I should add code that rejects a negative k.

I'll try solving it with a heap.
(https://docs.python.org/ja/3.13/library/heapq.html)
step2
```python
class KthLargest:
    import heapq

    def __init__(self, k: int, nums: List[int]):
        self.k = k
        if self.k <= 0:
            return print("The value of k is Error!")
        self.nums = nums
        heapify(self.nums)

    def add(self, val: int) -> int:
        heappush(self.nums,val)
        if self.nums == None:
            return None
        while len(self.nums) > self.k:
            heappop(self.nums)
        return self.nums[0]
```
Review comment (on `import heapq`): Imports are usually not placed inside a class. https://google.github.io/styleguide/pyguide.html#22-imports

Review comment (on `return print("The value of k is Error!")`): This prints the message and then returns None, which is the return value of print.
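
To illustrate what `return print(...)` does inside `__init__`, here is a minimal sketch. `Guarded` and `Strict` are hypothetical classes written only for this comparison. Because `print` returns None, `__init__` returns None as usual, so the object is still created, just without its attributes.

```python
class Guarded:
    """Hypothetical class mirroring the `return print(...)` guard."""

    def __init__(self, k: int):
        if k <= 0:
            # print() returns None, so construction still succeeds.
            return print("The value of k is Error!")
        self.k = k


class Strict:
    """Same guard, but raising an exception instead."""

    def __init__(self, k: int):
        if k <= 0:
            raise ValueError("k must be a positive integer")
        self.k = k


broken = Guarded(0)            # prints the message, still returns an object
has_k = hasattr(broken, "k")   # the attribute was never set

try:
    Strict(0)
    raised = False
except ValueError:
    raised = True

print(has_k, raised)
```

Raising an exception stops construction at the point of the error, whereas the print-based guard only defers the failure to the first call that needs the missing attribute.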
It was my first time using heapq, so it took me a while.
This time it ran in 20 ms.
I had assumed the heap would be slower than step1, so I was surprised by how fast it was.
Why is it this fast?
I read the heapq source in CPython.
(https://github.com/python/cpython/blob/3.8/Lib/heapq.py)
```python
def heappush(heap, item):
    """Push item onto heap, maintaining the heap invariant."""
    heap.append(item)
    _siftdown(heap, 0, len(heap)-1)

def heappop(heap):
    """Pop the smallest item off the heap, maintaining the heap invariant."""
    lastelt = heap.pop()    # raises appropriate IndexError if heap is empty
    if heap:
        returnitem = heap[0]
        heap[0] = lastelt
        _siftup(heap, 0)
        return returnitem
    return lastelt

def _siftdown(heap, startpos, pos):
    newitem = heap[pos]
    # Follow the path to the root, moving parents down until finding a place
    # newitem fits.
    while pos > startpos:
        parentpos = (pos - 1) >> 1
        parent = heap[parentpos]
        if newitem < parent:
            heap[pos] = parent
            pos = parentpos
            continue
        break
    heap[pos] = newitem

def _siftup(heap, pos):
    endpos = len(heap)
    startpos = pos
    newitem = heap[pos]
    # Bubble up the smaller child until hitting a leaf.
    childpos = 2*pos + 1    # leftmost child position
    while childpos < endpos:
        # Set childpos to index of smaller child.
        rightpos = childpos + 1
        if rightpos < endpos and not heap[childpos] < heap[rightpos]:
            childpos = rightpos
        # Move the smaller child up.
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2*pos + 1
    # The leaf at pos is empty now.  Put newitem there, and bubble it up
    # to its final resting place (by sifting its parents down).
    heap[pos] = newitem
    _siftdown(heap, startpos, pos)
```
I skimmed through it.
(https://stackoverflow.com/questions/38806202/whats-the-time-complexity-of-functions-in-heapq-library)
Even after reading this far, I don't understand why step2 is so much faster than step1.

Review comment: https://note.com/map1000da/n/n02d2fefa4343

If anyone knows, I would appreciate an explanation.

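
One way to look at the difference: the heap version only keeps the k largest values, and each push or pop walks a single root-to-leaf path instead of reordering the whole list. The sketch below is an illustration and not part of the original solution. It feeds a small stream through a size-k min-heap and checks each answer against a full sort. The values are taken from the LeetCode example, and the variable names are chosen for this sketch.

```python
import heapq

k = 3
stream = [4, 5, 8, 2, 3, 5, 10, 9, 4]

heap: list[int] = []      # min-heap holding the k largest values seen so far
seen: list[int] = []      # every value seen so far, for the sort-based check
results = []
for value in stream:
    seen.append(value)
    heapq.heappush(heap, value)
    if len(heap) > k:
        heapq.heappop(heap)   # the smallest can never be the k-th largest again
    if len(heap) == k:
        # Root of the min-heap = smallest of the k largest = k-th largest.
        assert heap[0] == sorted(seen, reverse=True)[k - 1]
        results.append(heap[0])

print(results)
```

Both approaches give the same answers, but the heap does about log k work per value while the sort-based check touches every value seen so far.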
step3
1st attempt: 1 min 56 s
2nd attempt: 1 min 31 s
3rd attempt: 1 min 57 s
```python
class KthLargest:
    import heapq

    def __init__(self, k: int, nums: List[int]):
        self.k = k
        if k <= 0:
            return print("The value of k is Error")
        self.nums = nums
        heapify(self.nums)

    def add(self, val: int) -> int:
        heappush(self.nums,val)
        if self.nums == None:
            return None
        while len(self.nums) > self.k:
            heappop(self.nums)
        return self.nums[0]
```
Review comment (on `self.nums = nums`): With this assignment, the caller's input list ends up being modified.

Review comment (on `if self.nums == None:`): PEP 8 appears to discourage using `== None`.
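
The aliasing issue with `self.nums = nums` can be seen in isolation. This is a minimal sketch with variable names chosen for illustration: plain assignment shares the list object, so `heapify` reorders the caller's list, while a slice copy leaves it alone.

```python
import heapq

original = [4, 5, 8, 2]

aliased = original          # same list object, as in `self.nums = nums`
heapq.heapify(aliased)      # reorders `original` too
mutated = original != [4, 5, 8, 2]

original = [4, 5, 8, 2]
copied = original[:]        # shallow copy, as in `self.nums = nums[:]`
heapq.heapify(copied)
untouched = original == [4, 5, 8, 2]

print(mutated, untouched)
```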

I'll rewrite the code based on the reviews.
First, I rewrite step1 using bisect.
Note that bisect assumes the list is sorted in ascending order.
```python
import bisect
from typing import List


class KthLargest:
    def __init__(self, k: int, nums: List[int]):
        if k <= 0:
            raise ValueError("k has to be bigger than 0!")
        # Stored as a negative index so nums[self.k] is the k-th largest.
        self.k = -k
        self.nums = nums[:]  # copy so the caller's list is not modified
        self.nums.sort()

    def add(self, val: int) -> int:
        bisect.insort(self.nums, val)
        return self.nums[self.k]
```
At 28 ms, this is much faster than step1.
However, k has to be negated up front, which may make the code harder to read.
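
One possible way around the negated k, shown here as a sketch and not as the reviewed solution, is to keep k positive and negate it only at the point of indexing. The attribute name `sorted_nums` is an assumption made for this example.

```python
import bisect
from typing import List


class KthLargest:
    def __init__(self, k: int, nums: List[int]):
        if k <= 0:
            raise ValueError("k must be a positive integer")
        self.k = k
        self.sorted_nums = sorted(nums)  # new ascending list; input is untouched

    def add(self, val: int) -> int:
        bisect.insort(self.sorted_nums, val)
        # In ascending order, the k-th largest sits k places from the end.
        return self.sorted_nums[-self.k]


kth_largest = KthLargest(3, [4, 5, 8, 2])
results = [kth_largest.add(v) for v in (3, 5, 10, 9, 4)]
print(results)
```

With the LeetCode example input this should print `[4, 5, 5, 8, 8]`.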
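>0 <= nums.length <= 10^4
1 <= k <= nums.length + 1
-10^4 <= nums[i] <= 10^4
-10^4 <= val <= 10^4

The maximum value of len(nums) is 10^4.
In __init__, sort() is n log n, so about 10^4 * log2(10^4) ~ 10^4 * 13.3 ~ 1.3 * 10^5, where n is nums.length.
>At most 10^4 calls will be made to add.

In add, bisect uses binary search, so finding the insertion point is log n.
With at most 10^4 calls, that is another 10^4 * log2(10^4) ~ 1.3 * 10^5 comparisons.
So the total is at most about 2 * 10^4 * 13.3 ~ 2.7 * 10^5 comparisons.
Note that insort also has to shift elements to make room, which is O(n) per call. That shift happens inside the list insertion itself, which is typically fast in CPython.
Python handles roughly one million (10^6) steps per second, so this fits within one second.
In step1, add re-sorted the list on every call, so 10^4 calls cost up to about 10^4 * 10^4 * 13.3 ~ 1.3 * 10^9 as a worst-case bound.
I see, so that is why it was slow.
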
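
The back-of-the-envelope numbers can be reproduced with a few lines of arithmetic. This is a rough sketch using base-2 logarithms and the maximum sizes from the constraints. It counts comparisons only and ignores constant factors, so the variable names and the resulting figures are estimates, not measurements.

```python
import math

n = 10**4        # maximum len(nums) from the constraints
calls = 10**4    # maximum number of add() calls
log_n = math.log2(n)   # about 13.3

# bisect version: one initial sort plus one binary search per add().
bisect_steps = n * log_n + calls * log_n

# step1 version: a full sort inside every add(), as a worst-case bound.
resort_steps = calls * n * log_n

print(f"{log_n:.1f} {bisect_steps:.2e} {resort_steps:.2e}")
```

The gap of roughly four orders of magnitude between the two estimates is the point. The absolute numbers matter less.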
Next, the heap-based solution.
```python
import heapq
from typing import List


class KthLargest:
    def __init__(self, k: int, nums: List[int]):
        if k <= 0:
            raise ValueError("The value of k is Error!")
        self.k = k
        self.nums = nums[:]  # copy so the caller's list is not modified
        heapq.heapify(self.nums)

    def add(self, val: int) -> int:
        heapq.heappush(self.nums, val)
        # Keep only the k largest values; the root is then the k-th largest.
        while len(self.nums) > self.k:
            heapq.heappop(self.nums)
        return self.nums[0]
```
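At 8 ms, this was very fast.
heappush and heappop are O(log n) and heapify is O(n), so
__init__ costs at most about 10^4.
The worst case for add is self.k = 1 with len(self.nums) = 10^4: the first call pops about 10^4 elements, and after that the heap stays at size k.
Across at most 10^4 calls to add there are at most 10^4 pushes and 2 * 10^4 pops, each costing at most about log2(10^4) ~ 13.3.
So the worst case for the whole code is about 10^4 + 3 * 10^4 * 13.3 ~ 4.1 * 10^5.
Fast.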
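
A variation that keeps the heap at size k from the start is sketched below. It is an alternative, not the reviewed solution. It relies on `heapq.nlargest` and `heapq.heappushpop`, both part of the standard library, and the attribute name `top_k` is an assumption for this example.

```python
import heapq
from typing import List


class KthLargest:
    def __init__(self, k: int, nums: List[int]):
        if k <= 0:
            raise ValueError("k must be a positive integer")
        self.k = k
        # nlargest returns a new list, so the caller's list is not modified.
        self.top_k = heapq.nlargest(k, nums)
        heapq.heapify(self.top_k)

    def add(self, val: int) -> int:
        if len(self.top_k) < self.k:
            heapq.heappush(self.top_k, val)
        else:
            # Push val, then pop the smallest, in a single call.
            heapq.heappushpop(self.top_k, val)
        return self.top_k[0]


kth_largest = KthLargest(3, [4, 5, 8, 2])
results = [kth_largest.add(v) for v in (3, 5, 10, 9, 4)]
print(results)
```

Trimming in `__init__` means no single call to add has to pop many elements, at the cost of doing that work up front.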
Review comment: self.nums can never be None, so this check is unnecessary.
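
A small sketch of why such a check is dead code, and why `is None` is the usual idiom when a None check really is needed. `AlwaysEqual` is a hypothetical class used only to show how `==` can be overridden. The sketch catches both `TypeError` and `AttributeError` because the exact exception depends on which heapq implementation is in use.

```python
import heapq

heap = None
try:
    heapq.heappush(heap, 1)   # fails before any later `== None` check could run
    reached_check = True
except (TypeError, AttributeError):
    reached_check = False


class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with anything."""

    def __eq__(self, other: object) -> bool:
        return True


value = AlwaysEqual()
equals_none = value == None   # True here, because __eq__ is overridden
is_none = value is None       # False: identity cannot be overridden

print(reached_check, equals_none, is_none)
```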