python - scipy.optimize minimize error for neural network
I'm building a neural network and trying to optimize the theta parameters using the minimize function from scipy.optimize.
Something interesting is happening. When I build the network on the first 1,000 rows of data (by using nrows=1000), the minimize function works fine. When I change to nrows=2000, the code runs but returns the output shown in the screenshot linked below.
Code:
fmin = minimize(fun=backprop, x0=params, args=(input_size, hidden_size, num_labels, x, y, learning_rate), method='TNC', jac=True, options={'maxiter': 250})
Screenshot of the output with nrows=2000
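For reference, with jac=True the function passed to minimize has to return a (cost, gradient) pair, and the result object carries success and message fields that report why the solver stopped. A minimal sketch on a toy quadratic, assuming nothing about the real network (toy_cost and target are made up for illustration and are not part of my code):

```python
import numpy as np
from scipy.optimize import minimize

def toy_cost(params, target):
    # Hypothetical stand-in for backprop: returns (cost, gradient),
    # which is the contract minimize expects when jac=True.
    diff = params - target
    cost = 0.5 * np.sum(diff ** 2)
    grad = diff
    return cost, grad

target = np.array([1.0, -2.0, 3.0])
result = minimize(fun=toy_cost, x0=np.zeros(3), args=(target,),
                  method='TNC', jac=True)

# Inspect why the solver stopped and whether the final cost is finite.
print(result.success, result.message)
print(np.isfinite(result.fun), result.x)
```

Printing the same fields for fmin in both the nrows=1000 and nrows=2000 runs should show whether the cost itself has become nan or inf.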
Since I'm using back-propagation and a truncated Newton algorithm, I assumed this was because of NA values in the data. Hence, I began the process by running:
train.fillna(train.mean())
But that still led to the output above. Any idea why? For reference, below are the backpropagation function and a screenshot of the output with nrows=1000.
import numpy as np

def backprop(params, input_size, hidden_size, num_labels, x, y, learning_rate):
    m = x.shape[0]
    x = np.matrix(x)
    y = np.matrix(y)

    # reshape the parameter array into parameter matrices for each layer
    theta1 = np.matrix(np.reshape(params[:hidden_size * (input_size + 1)], (hidden_size, (input_size + 1))))
    theta2 = np.matrix(np.reshape(params[hidden_size * (input_size + 1):], (num_labels, (hidden_size + 1))))

    # run the feed-forward pass
    a1, z2, a2, z3, h = forward_propagate(x, theta1, theta2)

    # initializations
    j = 0
    delta1 = np.zeros(theta1.shape)  # (25, 401)
    delta2 = np.zeros(theta2.shape)  # (10, 26)

    # compute the cost
    for i in range(m):
        first_term = np.multiply(-y[i,:], np.log(h[i,:]))
        second_term = np.multiply((1 - y[i,:]), np.log(1 - h[i,:]))
        j += np.nansum(first_term - second_term)

    j = j / m

    # add the cost regularization term
    j += ((learning_rate) / (2 * m)) * (np.nansum(np.power(theta1[:,1:], 2)) + np.nansum(np.power(theta2[:,1:], 2)))

    # perform backpropagation
    for t in range(m):
        a1t = a1[t,:]  # (1, 401)
        z2t = z2[t,:]  # (1, 25)
        a2t = a2[t,:]  # (1, 26)
        ht = h[t,:]  # (1, 10)
        yt = y[t,:]  # (1, 10)

        d3t = ht - yt  # (1, 10)

        z2t = np.insert(z2t, 0, values=np.ones(1))  # (1, 26)
        d2t = np.multiply((theta2.T * d3t.T).T, sigmoid_gradient(z2t))  # (1, 26)

        delta1 = delta1 + (d2t[:,1:]).T * a1t
        delta2 = delta2 + d3t.T * a2t

    delta1 = delta1 / m
    delta2 = delta2 / m

    # add the gradient regularization term
    delta1[:,1:] = delta1[:,1:] + (theta1[:,1:] * learning_rate) / m
    delta2[:,1:] = delta2[:,1:] + (theta2[:,1:] * learning_rate) / m

    # unravel the gradient matrices into a single array
    grad = np.concatenate((np.ravel(delta1), np.ravel(delta2)))

    return j, grad
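One thing that may be worth checking in the cost loop: np.nansum skips nan but not inf, so if any entry of h saturates to exactly 0 or 1, np.log returns -inf and the cost stops being finite. Whether that is what happens with nrows=2000 is only an assumption. A minimal sketch of a vectorized cost with clipping (the helper name safe_cross_entropy and the eps value are hypothetical, not part of the function above):

```python
import numpy as np

def safe_cross_entropy(h, y, eps=1e-12):
    # Hypothetical helper: mean cross-entropy over the m rows, with h
    # clipped away from 0 and 1 so np.log never returns -inf.
    h = np.clip(np.asarray(h, dtype=float), eps, 1 - eps)
    y = np.asarray(y, dtype=float)
    m = y.shape[0]
    return -np.sum(y * np.log(h) + (1 - y) * np.log(1 - h)) / m

# Made-up predictions where the first row is fully saturated and wrong.
h = np.array([[1.0, 0.0], [0.2, 0.8]])
y = np.array([[0.0, 1.0], [0.0, 1.0]])

# Same expression as the loop, without clipping, for comparison.
with np.errstate(divide='ignore', invalid='ignore'):
    naive = np.nansum(-y * np.log(h) - (1 - y) * np.log(1 - h)) / y.shape[0]

print(np.isfinite(naive), np.isfinite(safe_cross_entropy(h, y)))
```

Clipping changes the cost only for saturated predictions, so the gradient computed from ht - yt can stay as it is.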
I'm aware that the additional data added more LabelEncoder() and OneHotEncoder() values, but I'm not quite sure whether that contributed to the error. Any help would be great, guys. Thanks!
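Two data-side checks might narrow this down. DataFrame.fillna returns a new object, so the filled values are lost unless the result is assigned back; and extra encoder columns change the feature count, so input_size has to match the number of columns actually passed in. A minimal sketch on a made-up frame (the column names and values are hypothetical):

```python
import numpy as np
import pandas as pd

# Made-up data standing in for the real training frame.
train = pd.DataFrame({'a': [1.0, np.nan, 3.0], 'b': [4.0, 5.0, np.nan]})

train.fillna(train.mean())          # result discarded, train still holds NaN
print(train.isna().sum().sum())

train = train.fillna(train.mean())  # assign the filled frame back
print(train.isna().sum().sum())

x = train.to_numpy()
input_size = x.shape[1]             # must match the columns after encoding
assert np.isfinite(x).all()
```

Running the same np.isfinite check on the real x and y right before calling minimize would rule the data in or out as the cause.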