# 1-4-gradient.py
#%%
# understanding gradients is a bit of math, but we'll try to keep this
# simple -- basically a numerical gradient tells you which direction
# you need to move when you are machine learning -- positive or
# negative -- as well as having an actual numerical value
# that tells you 'how much' you should move
# a machine learning loop takes the gradients for a group
# of tensor operations, and then updates the values of the tensors
# that are being 'learned' using the product of the gradients
# and the learning rate
# so if you know a bit of math, you know a gradient is a numerical
# representation of a derivative -- which means you know to ask
# 'a gradient with respect to what?' -- and the answer
# is a loss function: you use gradients to figure out the direction
# to go to make your loss smaller
# here is the simplest example from grade school algebra I could
# cook up -- before you knew algebra, solving it was a bit of a
# mystery, and I know I personally tried
# just plain guessing numbers to 'solve' equations --
# machine learning is a bit like that, but instead of just plain
# guessing, we use the gradient to figure out how far off we are,
# and what our next guess should be -- OK
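#%%
# a quick aside before the real lesson -- a numerical gradient is
# just a finite difference: nudge the input a hair and see how much
# the output moves; the helper and the function f below are purely
# illustrative, not part of the lesson's code
def numerical_gradient(f, x, eps=1e-6):
    # central difference: (f(x + eps) - f(x - eps)) / (2 * eps)
    return (f(x + eps) - f(x - eps)) / (2 * eps)

# f(x) = (x - 3)^2 -- calculus says f'(x) = 2 * (x - 3)
f = lambda x: (x - 3.0) ** 2
# at x = 5 the true derivative is 2 * (5 - 3) = 4 -- positive,
# so we'd step in the negative direction to shrink f
numerical_gradient(f, 5.0)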
#%%
import torch
# X + 1 = 3 -- that's our little bit of algebra
# here is our random initial guess -- torch.rand draws
# uniformly from [0, 1)
X = torch.rand(1, requires_grad=True)
# and our formula
Y = X + 1.0
Y
#%%
# now, our loss is -- how far are we off from 3?
def mse(Y):
    diff = 3.0 - Y
    # the divide-by-2 just makes the derivative come out clean
    return (diff * diff).sum() / 2
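#%%
# a quick sanity check on that loss -- it should be zero when Y
# hits the target of 3, and grow as we move away; the two inputs
# here are just illustrative guesses
mse(torch.tensor(3.0)), mse(torch.tensor(5.0))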
#%%
# the gradient on our X -- that tells us which direction
# we are 'off' from the right answer -- with a random start in
# [0, 1), Y lands below 3, so expect a negative gradient here
loss = mse(Y)
loss.backward()
X.grad
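#%%
# a hand check on that gradient -- with loss = (3 - Y)^2 / 2
# and Y = X + 1, calculus gives d(loss)/dX = Y - 3,
# so X.grad should match Y - 3 exactly
X.grad, (Y - 3.0).detach()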
#%%
# now -- let's use that gradient to solve some grade school
# algebra with simple machine learning
learning_rate = 1e-3
# clear the gradient left over from the cell above, so it
# doesn't accumulate into our first iteration
X.grad.zero_()
# here is our learning loop
for i in range(0, 10000):
    Y = X + 1.0
    loss = mse(Y)
    # here is the 'backpropagation' of the gradient
    loss.backward()
    # and here is the 'learning', so we temporarily turn off
    # gradient tracking while we update X by hand
    with torch.no_grad():
        # the gradient tells you which direction you are off
        # so you go in the opposite direction to correct the problem
        X -= learning_rate * X.grad
        # and we zero out the gradients to get fresh values on
        # each learning loop iteration
        X.grad.zero_()
# and -- here is our answer
X
# OK -- you can see that this is approximate -- and that's an
# important point -- machine learning is going to approximate,
# and you can control how close you get to the target answer
# by altering your learning rate or your number of iterations
# experiment with this by altering the `learning_rate`
# and the number of loops in `range` -- one such experiment
# follows in the next cell
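#%%
# one such experiment, as a sketch -- the same loop, but with a
# much larger learning rate and far fewer iterations; the specific
# numbers here are just a starting point, try your own
X2 = torch.rand(1, requires_grad=True)
learning_rate = 0.5
for i in range(20):
    loss = mse(X2 + 1.0)
    loss.backward()
    with torch.no_grad():
        X2 -= learning_rate * X2.grad
        X2.grad.zero_()
# this lands much closer to the exact answer of 2
X2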