python - scipy.optimize minimize error for a neural network


I'm building a neural network and trying to optimize the theta parameters using the minimize function from scipy.optimize.

Something interesting is happening. When I build the network on the first 1,000 rows of data (by using nrows=1000), the minimize function works fine. When I change to nrows=2000, the code runs but returns the output shown in the screenshot below.

Code:

    fmin = minimize(fun=backprop, x0=params,
                    args=(input_size, hidden_size, num_labels, x, y, learning_rate),
                    method='TNC', jac=True, options={'maxiter': 250})
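For context on the call above: when the jac flag is set to True, minimize expects the objective to return a (cost, gradient) pair, which is what backprop is written to do. A minimal sketch on a toy quadratic, where quadratic_with_grad and target are hypothetical names used only for illustration and not part of the network code:

```python
import numpy as np
from scipy.optimize import minimize


def quadratic_with_grad(params, target):
    """Toy objective returning (cost, gradient), the shape jac=True expects."""
    diff = params - target
    cost = 0.5 * float(np.dot(diff, diff))
    grad = diff  # gradient of 0.5 * ||params - target||^2
    return cost, grad


target = np.array([1.0, -2.0, 3.0])  # hypothetical optimum
result = minimize(fun=quadratic_with_grad, x0=np.zeros(3), args=(target,),
                  method='TNC', jac=True)
print(result.message)
print(result.x)
```

If either element of that pair contains NaN or inf, the optimizer has nothing usable to step on, so checking the cost and gradient returned for the initial params is a cheap first diagnostic.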

[Screenshot: output with nrows=2000]

Since I'm using back-propagation and the truncated Newton algorithm, I assumed this was caused by NA values in the data. Hence, I began the process by running:

    train.fillna(train.mean())

But this still led to the output above. Any idea why? For reference, below are the backpropagation function and a screenshot of the output with nrows=1000.
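One thing worth checking about the fillna step, stated as an assumption about how it was called: DataFrame.fillna returns a new DataFrame by default, so calling it without assigning the result leaves train unchanged. A small sketch with made-up data, assuming every column is numeric:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real training data.
train = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 4.0, 6.0]})

train.fillna(train.mean())  # result is discarded; train still contains NaN
nans_before = int(train.isna().sum().sum())

train = train.fillna(train.mean())  # assign the filled copy back
nans_after = int(train.isna().sum().sum())

# Also confirm there are no inf values before handing the data to the network.
all_finite = bool(np.isfinite(train.to_numpy()).all())
print(nans_before, nans_after, all_finite)
```

A column that is entirely NaN has a NaN mean, so it would stay unfilled even after the assignment; the final isfinite check is there to catch that case.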

    import numpy as np

    # forward_propagate and sigmoid_gradient are defined elsewhere in my code
    def backprop(params, input_size, hidden_size, num_labels, x, y, learning_rate):
        m = x.shape[0]
        x = np.matrix(x)
        y = np.matrix(y)

        # reshape the parameter array into parameter matrices for each layer
        theta1 = np.matrix(np.reshape(params[:hidden_size * (input_size + 1)], (hidden_size, (input_size + 1))))
        theta2 = np.matrix(np.reshape(params[hidden_size * (input_size + 1):], (num_labels, (hidden_size + 1))))

        # run the feed-forward pass
        a1, z2, a2, z3, h = forward_propagate(x, theta1, theta2)

        # initializations
        j = 0
        delta1 = np.zeros(theta1.shape)  # (25, 401)
        delta2 = np.zeros(theta2.shape)  # (10, 26)

        # compute the cost
        for i in range(m):
            first_term = np.multiply(-y[i,:], np.log(h[i,:]))
            second_term = np.multiply((1 - y[i,:]), np.log(1 - h[i,:]))
            j += np.nansum(first_term - second_term)

        j = j / m

        # add the cost regularization term
        j += ((learning_rate) / (2 * m)) * (np.nansum(np.power(theta1[:,1:], 2)) + np.nansum(np.power(theta2[:,1:], 2)))

        # perform backpropagation
        for t in range(m):
            a1t = a1[t,:]  # (1, 401)
            z2t = z2[t,:]  # (1, 25)
            a2t = a2[t,:]  # (1, 26)
            ht = h[t,:]  # (1, 10)
            yt = y[t,:]  # (1, 10)

            d3t = ht - yt  # (1, 10)

            z2t = np.insert(z2t, 0, values=np.ones(1))  # (1, 26)
            d2t = np.multiply((theta2.T * d3t.T).T, sigmoid_gradient(z2t))  # (1, 26)

            delta1 = delta1 + (d2t[:,1:]).T * a1t
            delta2 = delta2 + d3t.T * a2t

        delta1 = delta1 / m
        delta2 = delta2 / m

        # add the gradient regularization term
        delta1[:,1:] = delta1[:,1:] + (theta1[:,1:] * learning_rate) / m
        delta2[:,1:] = delta2[:,1:] + (theta2[:,1:] * learning_rate) / m

        # unravel the gradient matrices into a single array
        grad = np.concatenate((np.ravel(delta1), np.ravel(delta2)))

        return j, grad
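A possible weak spot in the cost loop above: if any entry of h saturates to exactly 0 or 1, np.log returns -inf, and np.nansum only skips NaN, not inf, so the cost can still become infinite. A hedged sketch of a clipped cost follows; cross_entropy_cost and eps are hypothetical names and not part of the original code, and clipping is a common workaround rather than a confirmed fix for this case:

```python
import numpy as np


def cross_entropy_cost(h, y, eps=1e-12):
    """Mean cross-entropy with h clipped away from 0 and 1 before the logs."""
    h = np.clip(np.asarray(h, dtype=float), eps, 1.0 - eps)
    y = np.asarray(y, dtype=float)
    m = h.shape[0]
    return float(np.sum(-y * np.log(h) - (1.0 - y) * np.log(1.0 - h)) / m)


# Hypothetical predictions: the first row is fully saturated.
h = np.array([[1.0, 0.0], [0.5, 0.5]])
y = np.array([[1.0, 0.0], [0.0, 1.0]])

# The unclipped terms hit 0 * log(0), which evaluates to NaN.
with np.errstate(divide="ignore", invalid="ignore"):
    naive_terms = -y * np.log(h) - (1.0 - y) * np.log(1.0 - h)

print(np.isnan(naive_terms).any())
print(cross_entropy_cost(h, y))
```

Saturation becomes more likely when new rows bring larger unscaled feature values, so comparing the range of h between the 1,000-row and 2,000-row runs may help narrow this down.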

[Screenshot: output with nrows=1000]

I'm aware that the additional data adds more LabelEncoder() and OneHotEncoder() values, but I'm not quite sure whether that contributed to the error. Any help would be great, thanks!
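On the encoder question: extra categories from OneHotEncoder add columns, so input_size and num_labels may differ between the 1,000-row and 2,000-row runs. The two reshapes in backprop imply a fixed length for params, which can be checked before calling minimize. A sketch under that assumption, where expected_param_count and check_param_shapes are hypothetical helpers and the array sizes are made up:

```python
import numpy as np


def expected_param_count(input_size, hidden_size, num_labels):
    """Number of entries consumed by the theta1 and theta2 reshapes in backprop."""
    return hidden_size * (input_size + 1) + num_labels * (hidden_size + 1)


def check_param_shapes(params, X, y_onehot, hidden_size):
    """Derive the layer sizes from the data and verify params matches them."""
    input_size = X.shape[1]
    num_labels = y_onehot.shape[1]
    expected = expected_param_count(input_size, hidden_size, num_labels)
    if params.size != expected:
        raise ValueError(
            f"params has {params.size} entries, expected {expected} for "
            f"input_size={input_size}, hidden_size={hidden_size}, "
            f"num_labels={num_labels}"
        )
    return input_size, num_labels


# Hypothetical sizes matching the shape comments in backprop: (25, 401) and (10, 26).
X = np.zeros((5, 400))
y_onehot = np.zeros((5, 10))
params = np.zeros(expected_param_count(400, 25, 10))
print(check_param_shapes(params, X, y_onehot, hidden_size=25))
```

Deriving input_size and num_labels from the encoded arrays each time, instead of hard-coding them, keeps params in step with whatever columns the encoders produce.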

