Studying Python (14) ~ Handwritten number recognition-learning process (5)



Hello, this is swim-lover. I’ve just started learning Python, and my approach is to learn by using it. In the 13th session, we covered partial derivatives and gradients, and I also tried out 3D plots. This time, we will again follow the reference book.

Reference Book

As a reference book for machine learning, I am using “Deep Learning from Scratch” by Koki Saitoh (O’Reilly Japan, September 2016).

Gradient Plot

Last time, we calculated the gradient at three points.

Next, let’s check the gradient with a vector diagram.

The Python code is getting more complicated, so here is an outline of what it does.

・numerical_grad(f, x) calculates the gradient at a single point

・Create the input coordinate data (x0, x1) with np.arange()

・Create the grid points (X, Y), such as (0,0), (0,1), (1,0), (1,1), with meshgrid()

・Flatten X and Y, stack them with np.array, and transpose with .T (this gives a tall matrix with one point per row)

・The X passed to numerical_grad_multi(f, X) holds multiple coordinates, and numerical_grad_multi(f, X) calls numerical_grad(f, x) once for each of them

・enumerate(X) yields each index idx together with its point x = (x0, x1)
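
To make the array shapes in the steps above easier to follow, here is a minimal sketch on a 2x2 grid. The name `points` is only for illustration and does not appear in the main code (it plays the role of `indata`).

```python
import numpy as np

# Small 2x2 grid so the intermediate arrays are easy to read
x0 = np.arange(0, 2)          # array([0, 1])
x1 = np.arange(0, 2)          # array([0, 1])
X, Y = np.meshgrid(x0, x1)    # X and Y both have shape (2, 2)

# Flatten to 1-D, stack, and transpose so that each row is one point (x0, x1)
points = np.array([X.flatten(), Y.flatten()]).T   # shape (4, 2)

# enumerate yields the row index together with the point itself
for idx, x in enumerate(points):
    print(idx, x)
```

With the default meshgrid indexing, the rows come out in the order (0,0), (1,0), (0,1), (1,1): the same set of grid points, with x0 varying fastest.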

import numpy as np
import matplotlib.pyplot as plt

def numerical_grad(f,x):
  #h = 10e-50  # bad example, too small value
  h = 1e-4    # good example
  grad=np.zeros_like(x)  #make zero data
  for idx in range(x.size):
    tmp = x[idx]
    #calc f(x+h)
    x[idx]=tmp + h #add h only x[idx]
    fxh1 = f(x)

    #calc f(x-h)
    x[idx]=tmp - h #subtract h only from x[idx]
    fxh2 = f(x)

    #calc grad about idx (central difference)
    grad[idx] = (fxh1 - fxh2) / (2*h)
    x[idx]=tmp  #restore tmp
  return grad

def numerical_grad_multi(f, X):
  grad = np.zeros_like(X)
  for idx, x in enumerate(X):#extract index and x data
    grad[idx] = numerical_grad(f, x) #call child func
  return grad

def func_x0_x1(x):
  return x[0]**2+x[1]**2

x0 = np.arange(-2, 2.0, 0.25) # make x0 data
x1 = np.arange(-2, 2.0, 0.25) # make x1 data
X, Y = np.meshgrid(x0, x1) #make lattice point
#print("array X len={}\n".format(X.size))

X = X.flatten() #convert one dim
Y = Y.flatten() #convert one dim
indata = np.array([X,Y]).T

grad = numerical_grad_multi(func_x0_x1,indata).T #convert Nx2 to 2xN

norm = np.sqrt(grad[0]**2 + grad[1]**2) #gradient magnitude, used as the color
scale = np.where(norm == 0, 1.0, norm) #avoid 0/0 where the gradient is zero (the origin)
plt.quiver(X, Y, -grad[0]/scale, -grad[1]/scale, norm, cmap="jet")
plt.xlim([-2, 2])
plt.ylim([-2, 2])
plt.show()

quiver() is used for the vector plot. Each vector is normalized so the arrows are drawn with the same length, and the magnitude of the gradient is shown by the arrow color instead.
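
The normalization can be checked on a few hand-picked vectors. This is only a sketch: the values in `gx` and `gy` are hypothetical gradient components of x0**2 + x1**2 at (1, 0), (0, 0) and (-2, 1.5), and the names `mag`, `safe`, `u`, `v` are mine, not from the reference book.

```python
import numpy as np

# Hypothetical gradient components at three points (assumed for illustration)
gx = np.array([2.0, 0.0, -4.0])
gy = np.array([0.0, 0.0, 3.0])

mag = np.hypot(gx, gy)                # magnitude of each vector: the color value
safe = np.where(mag == 0, 1.0, mag)   # avoid 0/0 where the gradient vanishes

# Negative gradient scaled to unit length: the arrow components
u = -gx / safe
v = -gy / safe

print(mag)              # magnitudes that would be passed as the color argument
print(np.hypot(u, v))   # arrow lengths after normalization
```

The arrays `u`, `v` and `mag` correspond to the third, fourth and fifth arguments of quiver() in the plot code.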

Each arrow points in the direction of the negative gradient, which is the direction in which the function value decreases fastest at that point. The reference book also stresses this as an important point.
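
As a rough numerical check of this statement, the sketch below takes a small step of the same size in many directions from one point and compares the results. The point `p`, the step size, and the helper names are my own assumptions for illustration. The analytic gradient (2*x0, 2*x1) is used here instead of numerical_grad to keep the example short.

```python
import numpy as np

def f(x):
    return x[0]**2 + x[1]**2

def analytic_grad(x):
    # For f = x0^2 + x1^2 the gradient is (2*x0, 2*x1)
    return 2.0 * x

p = np.array([1.0, 0.5])      # assumed starting point
step = 0.1                    # assumed small step size
g = analytic_grad(p)
unit_down = -g / np.linalg.norm(g)   # unit vector along the negative gradient

# Take a step of the same size in 360 evenly spaced directions
angles = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
values = np.array([f(p + step * d) for d in dirs])

# The sampled direction that lowers f the most
best = dirs[np.argmin(values)]

print(f(p), f(p + step * unit_down))
print(best, unit_down)
```

If the claim holds, `best` should agree with `unit_down` up to the one-degree sampling resolution.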


This time, I plotted the gradient as a vector field. The Python code is getting a bit more complicated.

Next time, I would like to continue studying how neural networks learn.