
[Python] Inverse problem / autoencoder neural network loss function plateau for high accuracy...

Discussion in 'Python' started by Stack, October 7, 2024.


    This question partially inherits from a previous question regarding a curve fit to approximate the Schott dispersion formula of a glass material, so that given only 2 values, n_e and V_e, the regression could fit a curve that predicts the 6 coefficients of the Schott formula (lambda is the wavelength in micrometers):

    n(lambda)^2 = A0 + A1*lambda^2 + A2*lambda^-2 + A3*lambda^-4 + A4*lambda^-6 + A5*lambda^-8

    In that question, the kind gentlemen pointed out my mistake and the program was able to fit a curve. However, after some experiments the fitted curve turned out not to be accurate enough. For example, given n_e = 1.7899 and V_e = 48, using the predicted 6 coefficients to reconstruct the dispersion, the calculated n_e was around 1.8 and V_e around 33, which is too inaccurate for optical simulation.

    I then decided to try a neural network to predict the 6 coefficients.

    The 6 coefficients can be used to calculate n_e by setting lambda equal to 546.07 nm, and V_e can be calculated from 2 other wavelengths using the definition:

    V_e = (n_e - 1) / (n_F' - n_C')

    Where:

    lambda_e = 0.54607 # e-line (546.07 nm)
    lambda_Fp = 0.47999 # F'-line (479.99 nm)
    lambda_Cp = 0.64385 # C'-line (643.85 nm)
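
    As a plain NumPy sketch of that relationship, reusing the wavelength constants above (the coefficient values are made up purely to illustrate the calculation, not taken from any real glass):

    import numpy as np

    def n_from_schott(A, lam):
        # Schott power series: n^2 = A0 + A1*lam^2 + A2*lam^-2 + A3*lam^-4 + A4*lam^-6 + A5*lam^-8
        A0, A1, A2, A3, A4, A5 = A
        return np.sqrt(A0 + A1 * lam**2 + A2 * lam**-2 + A3 * lam**-4 + A4 * lam**-6 + A5 * lam**-8)

    A = (2.60, -9.0e-3, 1.6e-2, 7.0e-4, -3.0e-5, 3.0e-6)  # hypothetical coefficients, for illustration only

    n_e = n_from_schott(A, lambda_e)
    V_e = (n_e - 1) / (n_from_schott(A, lambda_Fp) - n_from_schott(A, lambda_Cp))
    print(n_e, V_e)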


    This relationship makes it an inverse problem similar to an autoencoder. So I incorporated the Schott formula into the process:

    import tensorflow as tf

    def schott_dispersion(A, lam):
        # A has shape (batch, 6); lam is a scalar wavelength in micrometers
        A0, A1, A2, A3, A4, A5 = A[:, 0], A[:, 1], A[:, 2], A[:, 3], A[:, 4], A[:, 5]
        # Schott power series: the A1 term uses lambda**2
        n_squared = A0 + A1 * lam**2 + A2 * lam**(-2) + A3 * lam**(-4) + A4 * lam**(-6) + A5 * lam**(-8)

        # Guard against negative values before the square root
        n_squared = tf.clip_by_value(n_squared, clip_value_min=1e-6, clip_value_max=tf.float32.max)

        return tf.sqrt(n_squared)
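
    For context, this is how I expect the function to behave on a batch; the two coefficient rows below are placeholders for illustration, not real glasses:

    A_batch = tf.constant([[2.60, -9.0e-3, 1.6e-2, 7.0e-4, -3.0e-5, 3.0e-6],
                           [2.30, -8.0e-3, 1.1e-2, 5.0e-4, -2.0e-5, 2.0e-6]], dtype=tf.float32)
    print(schott_dispersion(A_batch, 0.54607))  # tensor of shape (2,), one refractive index per row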


    I then use the Schott formula to build a custom loss function:

    def custom_loss(y_true, y_pred):

        # Split y_true into x_true (n_e and V_e) and the actual coefficients
        n_e_true = y_true[:, 0]      # Refractive index n_e
        V_e_true = y_true[:, 1]      # Abbe number V_e
        schott_true = y_true[:, 2:]  # True Schott coefficients

        # Calculate refractive indices for e, F' and C' lines
        n_e_pred = schott_dispersion(y_pred, lambda_e)
        n_F_pred = schott_dispersion(y_pred, lambda_Fp)
        n_C_pred = schott_dispersion(y_pred, lambda_Cp)

        # Calculate predicted Abbe number V_e
        epsilon = 1e-6  # Small value to prevent division by zero
        V_e_pred = (n_e_pred - 1) / (tf.abs(n_F_pred - n_C_pred) + epsilon)

        # Compute the loss
        loss_n_e = tf.square(n_e_true - n_e_pred)
        loss_V_e = tf.square(V_e_true - V_e_pred)

        total_loss = tf.reduce_mean(loss_n_e + loss_V_e)
        return total_loss
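
    A quick sanity check of this loss on made-up tensors (the numbers are placeholders, chosen only to exercise the shapes the loss expects):

    y_true = tf.constant([[1.7899, 48.0, 2.60, -9.0e-3, 1.6e-2, 7.0e-4, -3.0e-5, 3.0e-6]], dtype=tf.float32)
    y_pred = tf.constant([[2.60, -9.0e-3, 1.6e-2, 7.0e-4, -3.0e-5, 3.0e-6]], dtype=tf.float32)
    print(custom_loss(y_true, y_pred))  # a single scalar loss value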


    Finally, I use this loss function to train the model:

    from tensorflow.keras import layers, models, regularizers
    from sklearn.preprocessing import StandardScaler

    def build_model():
        model = models.Sequential()
        model.add(layers.Dense(64, activation='relu', input_shape=(2,)))  # Input: (n_e, V_e)
        model.add(layers.Dense(256, activation='relu'))
        model.add(layers.Dropout(0.1))
        model.add(layers.Dense(128, activation='relu', kernel_regularizer=regularizers.l2(0.0001)))
        model.add(layers.Dense(6))  # Output: (A0, A1, A2, A3, A4, A5)
        return model

    scaler_X = StandardScaler()

    X_train = input_data.values
    y_train = concat.values

    X_train_scaled = scaler_X.fit_transform(X_train)

    model = build_model()
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001, clipvalue=1.0)
    model.compile(optimizer=optimizer, loss=custom_loss, run_eagerly=True)

    lr_scheduler = tf.keras.callbacks.ReduceLROnPlateau(monitor='loss', factor=0.5, patience=5, min_lr=1e-6)
    history = model.fit(X_train_scaled, y_train, epochs=100, batch_size=32, callbacks=[lr_scheduler])


    and run the training.

    However, the loss fluctuates between roughly 3,000 and 300,000 and never converges; the eventual predictions are vastly inaccurate and entirely unusable. Given my rather limited experience, I am not sure which part is causing the problem. Is there anything that could help improve the training?
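
    For reference, the accuracy check itself is roughly this (a sketch that assumes the trained model and the fitted scaler_X from above; the target values are the ones quoted earlier):

    import numpy as np

    query = scaler_X.transform(np.array([[1.7899, 48.0]]))  # target n_e and V_e
    A_pred = model.predict(query)                           # predicted Schott coefficients, shape (1, 6)

    n_e_chk = schott_dispersion(A_pred, lambda_e)
    V_e_chk = (n_e_chk - 1) / (schott_dispersion(A_pred, lambda_Fp) - schott_dispersion(A_pred, lambda_Cp))
    print(float(n_e_chk[0]), float(V_e_chk[0]))  # ideally close to 1.7899 and 48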

    Appendix


    The training data (a CSV table) is linked here.

    The following code is used to read the data:

    import pandas as pd

    df = pd.read_csv(path)
    input_data = df[['n_e', 'V_e']].dropna()
    concat = df[['n_e', 'V_e', 'A0', 'A1', 'A2', 'A3', 'A4', 'A5']].dropna()
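
    A shape check along these lines can confirm the column layout the loss function assumes (n_e and V_e first, then A0..A5):

    print(input_data.shape, concat.shape)   # expected (N, 2) and (N, 8) with the same N
    print(concat.columns.tolist())          # ['n_e', 'V_e', 'A0', 'A1', 'A2', 'A3', 'A4', 'A5']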


    I also suspect that the goal itself is unreliable or unreachable. The material that inspired all these questions is a glass with n_e=1.7899 and V_e=48 described in patent FR1233449. I believe it is a special glass manufactured during the mid '90s in the Leica fab lab for the Summilux 35mm f/1.4. In a library of over 3,000 modern glasses, none has the same parameters; maybe it's just too special?

    This turned out to be such a long post... I appreciate everyone's help and opinions, even if it's just reading this far.

