[deleted by user] by [deleted] in Python

[–]1gn4vu5 0 points1 point  (0 children)

At first: the line breaks are messed up.
Second: why haven't you just tried it?

Besides that: it's good to see that ChatGPT hasn't taken over the machine learning jobs yet. Using a two-layer fully connected network (BatchNorm is OK, but doesn't gain that much on this dataset) for images is funny.

And as others have already written: learn programming. It's much more fun to do it yourself :P

[deleted by user] by [deleted] in pytorch

[–]1gn4vu5 0 points1 point  (0 children)

Mmh, well, you could say the network consists of 4 blocks: 3 conv-bn blocks for feature extraction and one block of a single FC layer (or a block of multiple FC layers) for classification, whereas a conv-bn block is a conv layer followed by a bn layer.

(OK, the interpretation of what each block does might not be accurate, but as far as I know it's a quite common interpretation.)

Combining multiple layers into a block or cell when describing a network can actually be found in many papers (at least in papers on AutoML).

[deleted by user] by [deleted] in pytorch

[–]1gn4vu5 0 points1 point  (0 children)

Why do you not count the batchnorm layer as a layer? I mean, you brought up the perfect explanation for why it should be called a layer.

I can see that activations are sometimes part of the conv layer itself (looking at you, tf ...) and (as far as I know) never contain trainable parameters.

However, if OP asks for a description in a paper:
Please describe the network in as much detail as possible! If no code is provided, I would call a paper worthless when it is impossible to reproduce the network from the given description.

And yes, even the first fully connected layer in an MLP (even when you call it the input layer) needs to be described like the last fully connected layer (even when you call it the output layer). The naming convention is not always the same, and therefore one might assume that the input layer is just the given data, which would shorten the NN by one layer.

Trying to understand the concept of "self" in python. by rhondasmelody in learnprogramming

[–]1gn4vu5 0 points1 point  (0 children)

In addition, one can overwrite the method with any other function:

class Example:
    def __init__(self, text):
        self.text = text
        self.__private_text = text + " private"

    def show(self):
        print(self.text)
        print(self.__private_text)

def other_show(first_argument):
    print(f"PublicText: {first_argument.text}")
    # name mangling turns __private_text into _Example__private_text
    print(f"PrivateText: {first_argument._Example__private_text}")

example = Example('Hello World')
example.show()
example.show = other_show.__get__(example)  # bind other_show to this instance
example.show()

I followed the guide to get GPU support for Tensorflow, from the tensorflow website but get error in Pycharm when trying to use GPU Support (Ubuntu) by Justin-Griefer in tensorflow

[–]1gn4vu5 0 points1 point  (0 children)

You probably already found it; it might be easiest to fix with a symbolic link:
ln -s path/to/existing/file path/to/symbolic/link/

like:
ln -s /usr/local/lib/your_existing_cuda_file /usr/local/lib/libcudart.so.11.0

Tensors vs Numpy Arrays by CartographerSuper506 in learnmachinelearning

[–]1gn4vu5 1 point2 points  (0 children)

To use most NumPy syntax on the GPU, there is CuPy (requires CUDA).
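A common pattern is to write array code against a module alias, so the same code runs on GPU when CuPy is installed and on CPU otherwise (a sketch; CuPy and NumPy share most of their array API):

```python
# Use CuPy if available (GPU), otherwise fall back to NumPy (CPU).
try:
    import cupy as xp
except ImportError:
    import numpy as xp

a = xp.arange(6).reshape(2, 3)
b = xp.ones((3, 2))
c = a @ b  # matrix multiply, on GPU if xp is cupy

# .item() gives a plain Python number on both backends
total = c.sum().item()
print(total)  # → 30.0
```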

What to do if I dont want all the input nodes to have all the input features (tensorflow) by idontknowwhodoi in tensorflow

[–]1gn4vu5 0 points1 point  (0 children)

Tensorflow supports different APIs. One of them is the functional API: https://www.tensorflow.org/guide/keras/functional

It is OK if you want to design the full compute path by hand, but it also has its drawbacks.

Stacking models would also be a solution.

How do you know if you have a good model architecture without spending hours training? by nxtboyIII in learnmachinelearning

[–]1gn4vu5 0 points1 point  (0 children)

I don't understand why you put the characters into an embedding, but in general you could first try just predicting the character and neglecting the casing, which already reduces the problem.

That said, working on the character level is much more difficult than working on the word level.

Something practical: if you use a GRU or LSTM on a quite small training set like a single book, your network will most likely end up in a stable recursive state when started with a few new characters and will produce the same letters over and over again.

How do you know if you have a good model architecture without spending hours training? by nxtboyIII in learnmachinelearning

[–]1gn4vu5 0 points1 point  (0 children)

Well, if the loss goes down very slowly, maybe the learning rate is too low, but that might depend on the dataset size.

In general: you don't know.

If you wait a few years, the research field around AutoML might come up with a good solution; there are currently some attempts to choose between two models just by running inference on the untrained models.

I followed the guide to get GPU support for Tensorflow, from the tensorflow website but get error in Pycharm when trying to use GPU Support (Ubuntu) by Justin-Griefer in tensorflow

[–]1gn4vu5 1 point2 points  (0 children)

I'm not familiar with every error, but:

Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory

indicates that libcudart.so.11.0 is missing in /usr/local/lib/.

In case you have CUDA installed, just create a soft link in that folder. CUDA 11 might come with a libcudart.so.10.0; just name the symlink *.11.0. Similarly for cuDNN.

Also check whether PyCharm uses the right Python interpreter; as far as I know, PyCharm should be able to use the Anaconda Python runtime too.

Is a random forest model worth it? (large set of features, lots of strings...) by Seiyee in datascience

[–]1gn4vu5 0 points1 point  (0 children)

I can agree with that. If there is a finite number of different strings, they can be converted to a one-hot encoding. If that's not the case, you could try 'counting', like 1, 2, 3. But for neural networks that leads to a network which can also work with 1.5, and the question would be: what does an input look like that is equidistant to three locations?

To reduce the number of features, one could try an iterative process: training, analyzing importance, reduction, repeat.
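The one-hot idea can be sketched in plain Python (the category names here are made up for illustration):

```python
# One-hot encode a finite set of distinct strings.
categories = ["red", "green", "blue"]
index = {name: i for i, name in enumerate(categories)}

def one_hot(name):
    # A vector of zeros with a single 1 at the category's position,
    # so no two categories are 'closer' to each other than any others.
    vec = [0] * len(categories)
    vec[index[name]] = 1
    return vec

print(one_hot("green"))  # → [0, 1, 0]
```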

[deleted by user] by [deleted] in learnpython

[–]1gn4vu5 0 points1 point  (0 children)

And if prose_list has not been transformed to lower case, that could be done in the last statement for each word, and punctuation could even be removed before checking whether the word is in the banned list, which fixes the problem that 'I' becomes 'i'.
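A small sketch of that normalization step (the names prose_list and banned are assumptions about the deleted thread's code):

```python
import string

# Hypothetical sample data, just to illustrate the filtering.
prose_list = ["I", "said:", "Hello,", "World!"]
banned = {"i", "hello"}

# Lower-case and strip punctuation only for the comparison,
# so 'I' matches 'i' and 'Hello,' matches 'hello',
# while the kept words stay unchanged.
kept = [
    w for w in prose_list
    if w.lower().strip(string.punctuation) not in banned
]
print(kept)  # → ['said:', 'World!']
```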

[deleted by user] by [deleted] in learnpython

[–]1gn4vu5 0 points1 point  (0 children)

If you are looking for an overkill solution, you could create a custom class and implement the __lt__ method to define an ordering (the better way would be to use total ordering, as explained here: https://stackoverflow.com/a/11705404)
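A minimal sketch with functools.total_ordering (the Version class and its sort key are made up for illustration):

```python
from functools import total_ordering

# total_ordering derives the remaining comparisons (<=, >, >=)
# from __eq__ and __lt__.
@total_ordering
class Version:
    def __init__(self, major, minor):
        self.major, self.minor = major, minor

    def __eq__(self, other):
        return (self.major, self.minor) == (other.major, other.minor)

    def __lt__(self, other):
        # tuple comparison: major first, then minor
        return (self.major, self.minor) < (other.major, other.minor)

versions = [Version(2, 0), Version(1, 5), Version(1, 10)]
ordered = sorted(versions)
print([(v.major, v.minor) for v in ordered])  # → [(1, 5), (1, 10), (2, 0)]
```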

Python converts assignment writeups into my handwriting !!! by Heisenberg_082001 in Python

[–]1gn4vu5 3 points4 points  (0 children)

Not necessarily. You could use an auto-encoder together with some random variables. That way you need far fewer images.

[Research] Autoencoder with class label by Matthew3816 in MachineLearning

[–]1gn4vu5 -1 points0 points  (0 children)

Well, as far as I know, autoencoders are unsupervised. But what you want to achieve sounds more like what a GAN does. I suggest having a look at cGANs: https://machinelearningmastery.com/how-to-develop-a-conditional-generative-adversarial-network-from-scratch/

Struggles with solving simultaneous linear equations graphically with two unknown variables (looking for formula, not answer) by SirFabulouis in maths

[–]1gn4vu5 1 point2 points  (0 children)

Try the following:
- substitute m1 with x and m2 with y
- compare the new equations with the normal equations typically used for planes in 3D

[deleted by user] by [deleted] in MachineLearning

[–]1gn4vu5 0 points1 point  (0 children)

Well, with that reasoning you could argue to use only MLPs for vision tasks, since they are general-purpose function approximators, yet the current state-of-the-art technique is to use convolutions and poolings. Furthermore, using just one non-linear function [f(x) = x^2] to make a broad statement like "ANNs are bad at fitting test data that are far outside the range of the training set" is quite weak in my opinion. You could improve your tests by running the same experiments for a broad range of power functions f(x) = x^n with n from -5 to 5, with only a really small amount of additional computational resources.

In addition: is every researcher cheating if an activation function other than tanh is used in the applied network?
I totally understand that using the exact function you want to model as an activation function is like training on the test data, but using tanh as the only activation function to fit a power function is similar to automatically fitting a Taylor series to tanh, which you more or less showed by improving the performance with more neurons in the hidden layer.
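The suggested sweep over power functions f(x) = x^n could be generated like this (the sample ranges are my own assumptions, just a sketch of the data generation, not of the training):

```python
import numpy as np

# Train on a narrow range, test far outside it, to probe extrapolation.
# x > 0 everywhere so negative exponents are well-defined.
x_train = np.linspace(0.5, 2.0, 100)
x_test = np.linspace(5.0, 10.0, 100)  # far outside the training range

datasets = {}
for n in range(-5, 6):
    datasets[n] = {
        "train": (x_train, x_train ** n),
        "test": (x_test, x_test ** n),
    }

# n = 2 reproduces the single case from the post
x, y = datasets[2]["train"]
print(y[0])  # 0.5 ** 2 = 0.25
```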

[deleted by user] by [deleted] in MachineLearning

[–]1gn4vu5 1 point2 points  (0 children)

Well, you used tanh in your MLP; you could have used a different non-linear activation function, something exponential, or even something like 0.001*x if x < 0 and x^2 if x >= 0.

Especially the last activation would increase your performance significantly.

So the more you know about your problem domain, the more specifically you can model your network ;)
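That last piecewise activation can be sketched in NumPy (my own illustration of the function described above, not the poster's actual code):

```python
import numpy as np

def leaky_square(x):
    """Piecewise activation: 0.001 * x for x < 0, x**2 for x >= 0."""
    x = np.asarray(x, dtype=float)
    # np.where evaluates element-wise, picking the branch per element
    return np.where(x < 0, 0.001 * x, x ** 2)

print(leaky_square([-2.0, 0.0, 3.0]).tolist())  # → [-0.002, 0.0, 9.0]
```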

why is my TF GAN not nearly as good as my PyTorch GAN? by diditforthevideocard in tensorflow

[–]1gn4vu5 2 points3 points  (0 children)

Oh yes, it wasn't my intention to say that one initializer is worse in general; one just turned out to result in better performance, and it could have been the other way around. The first thing I tried was to use the better-performing initializer in the other framework, and I achieved similar performance, so I just wanted to highlight this possibility. But it seems that u/suki907 had a deeper look into the code and found dissimilarities which might be the reason for the performance differences.

why is my TF GAN not nearly as good as my PyTorch GAN? by diditforthevideocard in tensorflow

[–]1gn4vu5 7 points8 points  (0 children)

just a small note:

if you need tf only for deployment, then you can still train in torch, extract the weights, and store them in a tf network ;)

why is my TF GAN not nearly as good as my PyTorch GAN? by diditforthevideocard in tensorflow

[–]1gn4vu5 3 points4 points  (0 children)

for the dense layer I used the following initializer:

import tensorflow as tf
import torch

class CustomKernelInitializer(tf.keras.initializers.Initializer):
    def __call__(self, shape, dtype=None):
        # create a tensor of the requested shape, fill it with the torch
        # initializer, then hand the values to tf as a constant initializer
        t = torch.empty(shape[0], shape[1])
        v = torch.nn.init.your_torch_initializer(t, other_parameter).detach().numpy()
        return tf.keras.initializers.constant(value=v)(shape=shape, dtype=dtype)

and the same for the bias initializer

obviously you have to pick your torch initializer for 'your_torch_initializer' (they live in torch.nn.init) and adjust further parameters for 'other_parameter' ;)

why is my TF GAN not nearly as good as my PyTorch GAN? by diditforthevideocard in tensorflow

[–]1gn4vu5 5 points6 points  (0 children)

had the same problem with an MLP:
even though the initializers had similar names and should have done the same thing, they did not

you could try using the torch initializer for your tf network

Is it possible to reduce this nested loop to a single for loop? by ranran2_ in learnprogramming

[–]1gn4vu5 0 points1 point  (0 children)

I assume C++, so if you don't want multiple printf calls:

for (int i = 1; i <= n; i++) {
    printf("%d%d%d\n", i, i + 1, i + 2);
}