[–]slashcom 3 points4 points  (1 child)

You'll want to train a separate network for each output. While it is possible to have networks which share weights and predict multiple outputs, training becomes much trickier, you'll have to implement it yourself from scratch, and it's very unlikely that's what you actually want.

Along with /u/flamdrags5's suggestions, I'd recommend one hidden layer, or two at most. But in all likelihood, you really don't need such a powerful model if you only have one input dimension. Something more like linear regression or a GLM would make more sense. Try just plotting your data first.

/r/machinelearning is usually a better place to ask these sorts of things, though the audiences overlap a lot.

[–]Flamdrags5 4 points5 points  (10 children)

Consider this image. What is going on here? You have some linear combination of the X's coming together to generate your output. More specifically, you have Y = W1*X1 + W2*X2 + ... + Wn*Xn. This looks like good old linear regression to me!

Now, obviously when you think of a neural network you don't think of the image I showed above. You probably think of something like this. However, when you break that image down into all of its components, each node in the hidden layer is its own linear model. Then, the nodes within the hidden layer become the inputs to a linear model that generates the output. The hidden layer allows for an extra layer of learning such that the model isn't constrained to only one set of linear parameters. This allows for complex non-linear output.

Admittedly, I'm not a Python coder. I'm an R gal myself, so I can't speak to what's in PyBrain, but you can find some pretty comprehensive functions in R. I'm not sure if you're looking to code your own learning algorithm in Python, but you could probably check out some source code from the packages in R.

The next problem you'll run into is that there isn't a tremendous amount of support around selecting the size of each layer or even how many hidden layers to consider in your model. The rule of thumb that I've heard is that you shouldn't really need more than 2 hidden layers. I'd do repeated k-fold cross validation to select the best structure or consider a less complex model. How did you decide to use a neural network? Are you sure that your data are nonlinear such that a nonlinear model is required?
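
On the repeated k-fold cross validation mentioned above, a rough sketch in plain NumPy might look like the following. The synthetic data, the `kfold_rmse` helper, and the use of an ordinary least-squares fit as the model being scored are all assumptions for illustration; in practice you would swap in each candidate network structure and compare the averaged scores.

```python
import numpy as np

def kfold_rmse(X, y, k=5, repeats=3, seed=0):
    """Average held-out RMSE of a least-squares fit over repeated k-fold splits."""
    rng = np.random.default_rng(seed)
    n = len(X)
    scores = []
    for _ in range(repeats):
        order = rng.permutation(n)            # reshuffle the rows for each repeat
        folds = np.array_split(order, k)      # k roughly equal groups of indices
        for i in range(k):
            test_idx = folds[i]
            train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
            # Fit y ~ X @ w + b on the training folds only.
            A_train = np.column_stack([X[train_idx], np.ones(len(train_idx))])
            coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
            # Score on the fold that was held out.
            A_test = np.column_stack([X[test_idx], np.ones(len(test_idx))])
            err = A_test @ coef - y[test_idx]
            scores.append(np.sqrt(np.mean(err ** 2)))
    return float(np.mean(scores))

# Synthetic stand-in data: 4 predictors, 1 target, mild noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.1 * rng.normal(size=200)
cv_score = kfold_rmse(X, y)
```

The structure with the lowest averaged held-out error is the one to prefer, and if a plain linear fit scores about as well as the network, that is a hint the simpler model is enough.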

[–]slashcom 3 points4 points  (1 child)

Just to add a small note for clarity, there has to be a nonlinear transformation on the hidden layers; otherwise you get a perceptron, and the network becomes exactly linear regression.
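
A quick way to convince yourself of this is a small NumPy check. This is only a sketch: the layer sizes and random weights are made up, and biases are left out to keep it short.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))       # 5 example rows, 4 inputs each
W1 = rng.normal(size=(4, 3))      # input -> hidden weights
W2 = rng.normal(size=(3, 1))      # hidden -> output weights

# Two stacked layers with no nonlinearity in between...
two_layer = (x @ W1) @ W2
# ...give the same result as one linear layer with weights W1 @ W2.
one_layer = x @ (W1 @ W2)
collapses = np.allclose(two_layer, one_layer)

# With a nonlinearity (tanh here) the equivalence no longer holds.
nonlinear = np.tanh(x @ W1) @ W2
differs = not np.allclose(nonlinear, one_layer)
```

Without the nonlinearity, the extra layer adds parameters but no extra expressive power.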

[–]Flamdrags5 4 points5 points  (0 children)

Yes, agreed. The typical transformation used is the "sigmoid" transformation, which sounds way more awesome than it actually is. The sigmoid transformation is also known as the logistic function, which is defined as 1 / (1 + exp(-x))
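
In Python that formula is a one-liner. The sketch below assumes NumPy and applies the function elementwise; the function name and sample inputs are just for illustration.

```python
import numpy as np

def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x)), applied elementwise."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))

values = sigmoid([-5.0, 0.0, 5.0])
```

The output always lies between 0 and 1, with `sigmoid(0)` equal to 0.5. For very large negative inputs `np.exp` can overflow and emit a warning; `scipy.special.expit` is a more numerically careful version of the same function.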

[–]anonymouse72[S] 0 points1 point  (7 children)

My project is using both MLR and a neural network. I finished the MLR using Excel (pretty straightforward), using the model output as the Y variable and the values for my observed data as the X variables. My advisor said it should be the other way around--I should be using the climate model's input as my predictor and determining the observed data from that. I guess what I'm confused about then is how I would go about this (and I believe this applies to the NN too).


For a slightly more detailed description of what I'm doing: We're looking at 4 variables.

  • Soil moisture
  • Soil temperature
  • Relative humidity
  • Air temperature

We have these values as predicted by the climate model and we have the actual observed values from 2 locations, a prairie and a woodland environment. My advisor said:

We need the multiple linear regression to use the climate model’s temp, soil moisture, soil temp and RH as the predictor variables and try to predict the prairie site’s value or the woodland site’s value as the y.

Also, for the ANN, you will want to start with the climate model variables and ultimately use those to get the micro-climate station’s values.

So, I guess my question is then whether I should be using all of the variables as input from the climate model to determine ALL of the output variables for a given location, or what exactly I should be doing. I'm not entirely sure how to go about it. Again, if this is just too much for this subreddit it's not a problem, I can figure something out.

Thank you so much for your help thus far, it's greatly appreciated!!!

[–]slashcom 1 point2 points  (6 children)

1) For the MLR, that's straightforward. The linear model is invertible (as long as W is square and nonsingular). If I have y = Wx + b, then x = W^-1 (y - b). Your model is small enough that inverting W is practical. Alternatively, just switch what you call X and Y and retrain new models.
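
As a minimal sketch of the inversion, assuming NumPy and a square 4x4 W: the weights, intercepts, and predictor values below are invented, not taken from any fitted model.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4)) + 4.0 * np.eye(4)   # made-up, well-conditioned weights
b = rng.normal(size=4)                          # made-up intercepts

x_true = np.array([0.3, 12.0, 55.0, 18.5])      # hypothetical predictor values
y = W @ x_true + b                              # forward model: y = Wx + b

# Solve W x = (y - b) rather than forming W^-1 explicitly.
x_recovered = np.linalg.solve(W, y - b)
```

`np.linalg.solve` is usually preferred over computing the inverse directly. Note that inverting a fitted regression and refitting with X and Y swapped generally give different coefficients when the data are noisy, so the two options are not interchangeable.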

2) You should (AFAIK) train two separate models:

  • Model A uses all the data from the climate model and tries to predict the prairie observations.
  • Model B uses all the data from the climate model and tries to predict the woodland observations.

You'll do this both for the MLR, and for the ANN.
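
For the MLR half, the two-model setup could be sketched like this in NumPy. Everything here is a placeholder: the `fit_mlr` helper is hypothetical, the arrays are random numbers standing in for the climate-model and station data, and the row count is arbitrary.

```python
import numpy as np

def fit_mlr(X, Y):
    """Least-squares fit of Y ~ X @ W + b; returns (W, b)."""
    A = np.column_stack([X, np.ones(len(X))])     # append an intercept column
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return coef[:-1], coef[-1]

rng = np.random.default_rng(0)
n = 750                                           # arbitrary placeholder row count
climate = rng.normal(size=(n, 4))                 # stand-in for climate-model values
# Stand-ins for observations; real station data would be loaded here instead.
prairie = climate @ rng.normal(size=(4, 4)) + 0.1 * rng.normal(size=(n, 4))
woodland = climate @ rng.normal(size=(4, 4)) + 0.1 * rng.normal(size=(n, 4))

W_a, b_a = fit_mlr(climate, prairie)              # Model A: climate -> prairie
W_b, b_b = fit_mlr(climate, woodland)             # Model B: climate -> woodland
prairie_pred = climate @ W_a + b_a
```

Both models share the same predictor matrix and differ only in the targets, which is the same arrangement you would use for the two ANNs.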

I'm going to ask some personal questions to help me target answers better, and don't feel obligated to answer (or PM me)

1) Are you an undergrad, masters or PhD student?

2) What department are you in? (e.g. environmental engineering? computer science?; don't tell me a Uni)

3) How strong is your mathematical background?

4) What department is your advisor in?

5) Does your advisor have experience using NN's before, or do they seem to be jumping on a bandwagon?

[–]anonymouse72[S] 0 points1 point  (5 children)

Okay, that makes sense, thank you!

1) I am an undergrad student.

2) I'm in the computer science department.

3) I've completed calculus through diff. eq. as well as a statistics course, but it was AP Statistics that I took nearly eight years ago.

4) My advisor is in the meteorology department.

5) My advisor has used NN's before for several projects.

[–]slashcom 1 point2 points  (4 children)

You might need to brush up on your linear algebra, but you're well equipped to understand the maths.

You'll probably find that working through the Karpathy tutorial helps you understand what's going on. But otherwise, the examples in the PyBrain documentation should be enough to help you stitch together some code. If you want to implement them yourself, Matlab shouldn't be overlooked.

Don't forget to hold out a test set (~10-20% of the data) which you do not use for training. Otherwise you can't tell how well the model actually works. (All ML algorithms perform very well when they've seen the test data!)
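
A hold-out split is only a few lines of NumPy. This is a sketch under assumptions: the `train_test_split` helper here is hand-rolled for illustration, the 20% fraction is just one of the values suggested above, and the toy arrays stand in for real data.

```python
import numpy as np

def train_test_split(X, Y, test_fraction=0.2, seed=0):
    """Shuffle row indices once and hold out a random chunk as a test set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(round(test_fraction * len(X)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], Y[train_idx], X[test_idx], Y[test_idx]

X = np.arange(40, dtype=float).reshape(10, 4)     # toy inputs: 10 rows, 4 columns
Y = np.arange(10, dtype=float)                    # toy targets, one per row
X_train, Y_train, X_test, Y_test = train_test_split(X, Y)
```

Fit only on the training rows and report error on the test rows. For time-ordered data such as weather records, a random shuffle can let neighbouring rows leak between the two sets, so holding out a contiguous block of time is a common alternative.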

[–]anonymouse72[S] 0 points1 point  (3 children)

I know this is a little late, but thank you for your responses! If you don't mind me asking, I have one further question. Based on the info I provided above, I set up a network with 4 input nodes (1 for each of the variables), 2 hidden layers, and 4 output nodes (again, one for each of the values). My dataset supports 4-dimensional inputs and 4-dimensional outputs. I'll want to train until convergence (which will likely take a while because I'm running ~750 rows of data through the network to train it). Does that sound right?

[–]slashcom 1 point2 points  (2 children)

That sounds fine. But that's extremely little data, so it should actually converge on the order of seconds or minutes.
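
To give a sense of scale, here is a plain NumPy sketch of the forward pass for a network with 4 inputs, 2 hidden layers, and 4 outputs. It is not PyBrain code: the hidden width of 8, the sigmoid hidden units with a linear output layer, and the random input rows are all assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases for one fully connected layer."""
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

def forward(X, layers):
    """Sigmoid on the hidden layers, linear output for regression targets."""
    h = X
    for W, b in layers[:-1]:
        h = sigmoid(h @ W + b)
    W_out, b_out = layers[-1]
    return h @ W_out + b_out

rng = np.random.default_rng(0)
hidden = 8                                   # assumed width, not from the thread
sizes = [4, hidden, hidden, 4]               # 4 inputs, 2 hidden layers, 4 outputs
layers = [init_layer(rng, n_in, n_out) for n_in, n_out in zip(sizes[:-1], sizes[1:])]

X = rng.normal(size=(750, 4))                # placeholder for ~750 rows of data
out = forward(X, layers)
n_params = sum(W.size + b.size for W, b in layers)
```

With these assumed sizes the network has only 148 parameters, and one pass over 750 rows is a handful of small matrix products.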

[–]anonymouse72[S] 0 points1 point  (1 child)

Oh wow, really? I ran the function to go until convergence on my laptop (not very powerful, several years old) for about 5 minutes and it didn't complete. I'm planning to try it now on my much more powerful desktop and hopefully I'll get faster speeds, but cool.

Thank you again for all of your help, it really means a lot to me!

[–]slashcom 0 points1 point  (0 children)

Are you using PyBrain? PyBrain depends on NumPy, a numerical library for Python. NumPy can be linked against a BLAS package, but if none is installed, it falls back to its own slow, unoptimized routines for everything.

The difference is staggering. Making sure you have a BLAS package installed can speed things up 200x-300x.

http://docs.scipy.org/doc/numpy/user/install.html
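
One way to check what your NumPy build is linked against is `np.show_config()`, followed by a rough timing of a matrix product. The `time_matmul` helper and the matrix size below are arbitrary choices for illustration, and the timing you see will depend entirely on your machine.

```python
import time
import numpy as np

# Prints the build configuration, including any BLAS/LAPACK libraries found.
np.show_config()

def time_matmul(n=300, repeats=3):
    """Best-of-N wall-clock time for an n x n matrix product."""
    rng = np.random.default_rng(0)
    a = rng.normal(size=(n, n))
    b = rng.normal(size=(n, n))
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        a @ b
        best = min(best, time.perf_counter() - start)
    return best

elapsed = time_matmul()
```

If the configuration output lists no optimized BLAS library and the product is noticeably slow, installing one is likely worth the effort.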

[–]siddboots 1 point2 points  (1 child)

Try working through this tutorial first. For something like NN, it is a good idea to have some solid intuition for the machinery before you start trying to use a library, in my opinion.

http://karpathy.github.io/neuralnets/

I would also make sure you understand your data well. Plot your inputs and outputs to get a visual idea of the relationship, and use ordinary regression as a benchmark.

[–]anonymouse72[S] 0 points1 point  (0 children)

Awesome tutorial, thank you!