
[Python] Efficiently simulating many frequency-severity distributions over thousands of...

Discussion in 'Python' started by Stack, October 1, 2024 at 12:42.


    I've got a problem at work that goes as follows:

    We have, say, 1 million possible events, each defining a frequency-severity distribution. For each event we have an annual rate, which defines a Poisson distribution, and alpha and beta parameters for a Beta distribution. The goal is to simulate on the order of >100,000 "years", where each year consists of drawing a frequency N for each event and then drawing N samples from that event's Beta distribution.

    The awkward part for me is: how can I efficiently draw N_i ~ Poisson(lambda_i) samples from the Beta distribution Beta_i while still being able to attribute each sample to the correct year?

    In terms of output I'll need to look at both the maximum and the total value of the samples per year, so for now I'm storing the results as a list of dictionaries (not intended to be the final output format):
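As a toy illustration (with made-up numbers) of pulling the per-year maximum and total out of that intermediate list-of-dicts format:

```python
# Hypothetical stand-in for the intermediate structure described above
losses = [
    {'year': 0, 'losses': [120.0, 45.5, 0]},
    {'year': 1, 'losses': [0]},  # a year with no events
]

# Reduce each year to the two quantities of interest
summary = [
    {'year': d['year'], 'total': sum(d['losses']), 'max': max(d['losses'])}
    for d in losses
]
# summary[0] -> {'year': 0, 'total': 165.5, 'max': 120.0}
```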

    import numpy as np

    years = 5000

    rng = np.random.default_rng()
    losses = []
    for year in range(years):
        occurrences = rng.poisson(data['RATE'])  # one count per event
        annual_losses = []
        for idx, occs in enumerate(occurrences):
            if occs > 0:
                event = data.iloc[idx]
                for occ in range(occs):
                    loss = rng.beta(event['alpha'], event['beta']) * event['ExpValue']
                    annual_losses.append(loss)
        annual_losses.append(0)  # keeps max()/sum() well-defined for years with no events
        losses.append({'year': year, 'losses': annual_losses})


    I've tried to follow Optimising Python/Numpy code used for Simulation, but I can't see how to vectorise this code efficiently.

    Changes I've made before posting here are (times for 5000 years):

    • swapping from scipy to numpy (72s -> 66s)
    • calculating frequencies for all years at once outside the loop (66s -> 73s... oops)
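For reference, the "all years at once" change in the last bullet presumably amounts to a single Poisson call with a 2-D size, something like the following (the rates here are invented):

```python
import numpy as np

rng = np.random.default_rng(0)
rates = np.array([0.5, 2.0, 0.1])  # hypothetical per-event annual rates
years = 5000

# One call draws a (years, n_events) matrix of counts
all_counts = rng.poisson(rates, size=(years, len(rates)))
```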

    Ideally I'd like this to run as fast as possible (or for as many iterations as possible), and I was also running into memory issues when using scipy previously.
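One way to vectorise the whole simulation (a sketch over assumed toy data; the column names RATE/alpha/beta/ExpValue are taken from the question) is to draw all Poisson counts up front, expand the per-event Beta parameters per occurrence with np.repeat, make a single rng.beta call, and then aggregate per year with np.bincount and np.maximum.at:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy parameters standing in for the real 1M-event table
n_events = 1_000
rates = rng.uniform(0.001, 0.01, n_events)    # 'RATE'
alphas = rng.uniform(1.0, 5.0, n_events)      # 'alpha'
betas = rng.uniform(1.0, 5.0, n_events)       # 'beta'
exp_values = rng.uniform(1e3, 1e6, n_events)  # 'ExpValue'

years = 200

# 1) All Poisson counts at once: shape (years, n_events)
counts = rng.poisson(rates, size=(years, n_events))

# 2) One index per individual occurrence, in the same year-major
#    order as counts.ravel()
year_idx = np.repeat(np.arange(years), counts.sum(axis=1))
event_idx = np.repeat(np.tile(np.arange(n_events), years), counts.ravel())

# 3) A single vectorised Beta draw for every occurrence
samples = rng.beta(alphas[event_idx], betas[event_idx]) * exp_values[event_idx]

# 4) Per-year totals and maxima (0 for years with no occurrences,
#    matching the append(0) in the loop version)
totals = np.bincount(year_idx, weights=samples, minlength=years)
maxima = np.zeros(years)
np.maximum.at(maxima, year_idx, samples)
```

At the full scale in the question (100,000 years × 1M events) the counts matrix alone would be too large to hold at once, so in practice one would chunk over batches of years; the per-chunk pattern stays the same.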

