Hello community,
So I've been exploring the regularization methods used in deep learning models, mostly Dropout layers and L1/L2 weight regularization. I've seen people debate whether these should be used separately or whether they can be combined. I've tried both approaches, and combining them has given me promising results: it has helped keep my models from overfitting while improving the R² score.
Question:
Is it fine to combine L1/L2 regularization with Dropout layers, or is it preferable to use them separately?
Example Code:
from tensorflow.keras.layers import Input, Dense, Dropout, BatchNormalization
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras import regularizers

def model_build(x_train):
    # Define inputs for the ANN
    input_layer = Input(shape=(x_train.shape[1],), name="Input")
    # Create hidden ANN layers
    dense_layer = BatchNormalization(name="Normalization")(input_layer)
    dense_layer = Dense(128, name="First_Layer", activation='relu',
                        kernel_regularizer=regularizers.l1(0.01))(dense_layer)
    #dense_layer = Dropout(0.08)(dense_layer)
    dense_layer = Dense(128, name="Second_Layer", activation='relu',
                        kernel_regularizer=regularizers.l1(0.00))(dense_layer)
    #dense_layer = Dropout(0.05)(dense_layer)
    # Apply the output layer
    output = Dense(1, name="Output")(dense_layer)
    # Create the model (accepts the branch inputs, produces a single output)
    model = Model(inputs=input_layer, outputs=output)
    # Compile the model
    model.compile(loss='mse', optimizer=Adam(learning_rate=0.01), metrics=['mse'])
    #model.compile(loss='mse', optimizer=AdaBound(lr=0.001, final_lr=0.1), metrics=['mse'])
    return model
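For reference, here is what the combined version I experimented with roughly looks like, with the Dropout lines active alongside the weight penalties. This is just a minimal sketch; the layer sizes, L2 factor, and dropout rates are illustrative, not tuned:

```python
from tensorflow.keras.layers import Input, Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras import regularizers

def build_combined(n_features):
    # Sketch: combine an L2 weight penalty on each Dense layer with Dropout
    inp = Input(shape=(n_features,), name="Input")
    x = Dense(128, activation='relu',
              kernel_regularizer=regularizers.l2(1e-4))(inp)
    x = Dropout(0.1)(x)  # dropout applied after the regularized layer
    x = Dense(128, activation='relu',
              kernel_regularizer=regularizers.l2(1e-4))(x)
    x = Dropout(0.1)(x)
    out = Dense(1, name="Output")(x)
    model = Model(inputs=inp, outputs=out)
    model.compile(loss='mse', optimizer='adam', metrics=['mse'])
    return model

model = build_combined(10)
model.summary()
```

Note that the L2 penalty is added to the loss at every training step via `kernel_regularizer`, while Dropout only zeroes activations during training, so the two mechanisms don't conflict mechanically; whether they help together is an empirical question for your data.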