p higher than 0.5 after training... #4

@andrisecker

Quick and hopefully not stupid question:
I'm trying to use the ConcreteDropout class to train a convnet that classifies images into 12 classes. The first strange thing I observed is that the first 3 convolutional layers usually end up with higher dropout probabilities than the dense layers that follow (independent of N). What actually worries me, though, is that the probabilities are sometimes higher than 0.5. See the sample output below:

print(np.array([K.eval(layer.p) for layer in model.layers if hasattr(layer, "p")]))
[0.59234613 0.4666404  0.2114246  0.10445894 0.10087071]
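
To make the concern concrete, here is a minimal sketch that flags which entries of the printed array exceed 0.5. The values are copied from the sample output above; the names `p` and `above_half` are just illustrative, and the indices count ConcreteDropout layers in the order they were printed, not positions in `model.layers`.

```python
# Dropout probabilities copied from the sample output above,
# one entry per ConcreteDropout-wrapped layer, in printed order.
p = [0.59234613, 0.4666404, 0.2114246, 0.10445894, 0.10087071]

# Collect the positions whose learned dropout probability is above 0.5.
above_half = [i for i, p_i in enumerate(p) if p_i > 0.5]

for i in above_half:
    print("ConcreteDropout layer %d has p = %.4f (> 0.5)" % (i, p[i]))
```

In this sample only the first wrapped layer (the first convolution) crosses 0.5; the second is close to it, and the dense layers stay near 0.1.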

Full model structure:

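import numpy as np
from keras import backend as K
from keras import optimizers
from keras.callbacks import History
from keras.layers import Convolution2D, Dense, Flatten, MaxPooling2D
from keras.models import Sequential
# ConcreteDropout is the wrapper class from this repository and must already be defined.

N = len(train_images)  # train_images: the training set, loaded elsewhere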
l = 1e-5  # length-scale parameter (tau, the model precision parameter, is 1 for classification)
wd = l**2. / N  # this will be the l2 weight regularizer
dd = 1. / N  # this will regularize dropout (depends only on dataset size)

K.clear_session()
model = Sequential()
model.add(ConcreteDropout(Convolution2D(24, (11, 11), strides=(4, 4),
                                        padding="same", activation="relu",
                                        kernel_initializer="he_uniform", bias_initializer="zeros",
                                        data_format="channels_last"),
                          weight_regularizer=wd, dropout_regularizer=dd,
                          input_shape=(256, 256, 1)))
model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding="valid", data_format="channels_last"))

model.add(ConcreteDropout(Convolution2D(96, (5, 5),
                                        padding="same", activation="relu",
                                        kernel_initializer="he_uniform", bias_initializer="zeros",
                                        data_format="channels_last"),
                          weight_regularizer=wd, dropout_regularizer=dd))
model.add(MaxPooling2D(pool_size=(3, 3), padding="valid", data_format="channels_last"))

model.add(ConcreteDropout(Convolution2D(96, (3, 3),
                                        padding="same", activation="relu",
                                        kernel_initializer="he_uniform", bias_initializer="zeros",
                                        data_format="channels_last"),
                          weight_regularizer=wd, dropout_regularizer=dd))
model.add(MaxPooling2D(pool_size=(3, 3), padding="valid", data_format="channels_last"))

model.add(Flatten())
model.add(ConcreteDropout(Dense(512, activation="relu",
                                kernel_initializer="he_uniform", bias_initializer="zeros"),
                          weight_regularizer=wd, dropout_regularizer=dd))

model.add(ConcreteDropout(Dense(512, activation="relu",
                                kernel_initializer="he_uniform", bias_initializer="zeros"),
                          weight_regularizer=wd, dropout_regularizer=dd))

model.add(Dense(12, activation="softmax",
                kernel_initializer="he_uniform", bias_initializer="zeros"))

opt = optimizers.SGD(lr=0.005, momentum=0.9, nesterov=True)

model.compile(loss="categorical_crossentropy",
              optimizer=opt,
              metrics=["categorical_accuracy"])

history = History()
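
For a sense of the magnitudes involved, here is a small sketch of the two regularizer coefficients and of the dropout penalty as a function of p. It rests on two assumptions that are not confirmed in this post: `N = 10000` is a made-up dataset size, and the dropout regularizer is taken to be the negative Bernoulli entropy described in the Concrete Dropout paper (the helper `neg_entropy` is hypothetical, not part of the ConcreteDropout class).

```python
import math


def neg_entropy(p):
    """Negative Bernoulli entropy p*log(p) + (1-p)*log(1-p); smallest at p = 0.5."""
    return p * math.log(p) + (1.0 - p) * math.log(1.0 - p)


N = 10000        # hypothetical dataset size, for illustration only
l = 1e-5         # length scale, as in the model definition above
wd = l**2. / N   # weight regularizer coefficient
dd = 1. / N      # dropout regularizer coefficient

print("wd = %.1e, dd = %.1e" % (wd, dd))

# Evaluate the assumed dropout penalty on both sides of p = 0.5.
for p in (0.1, 0.4, 0.5, 0.6, 0.9):
    print("p = %.1f  negative entropy = %.4f" % (p, neg_entropy(p)))
```

Under these assumptions the penalty is symmetric around 0.5, so on its own it would not prefer p = 0.4 over p = 0.6, and with such a small length scale wd is many orders of magnitude smaller than dd. Whether that accounts for the values above 0.5 seen here is exactly the open question.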

@yaringal @joeyearsley
