
[Python] MongoDB showing inconsistent results while querying using pymongo

Discussion in 'Python' started by Stack, October 3, 2024 at 21:12.

  1. Stack

    Stack Participating Member

    I am working with PyMongo, connecting to a client's database. The collection I am querying is composed of documents with a field called items, which can be an empty list or contain a series of subdocuments. As an example, a document can look like this:

    [​IMG]

    I am interested in the documents that have a non-empty list of items and for which the price is not 0. I first tried querying directly through the API as follows:

    from datetime import datetime
    cursor = colecc_cotizaciones.find({
        "items": {"$ne": []},
        "items.price": {"$ne": 0},
        "date": {"$gte": datetime(2023, 9, 1), "$lt": datetime(2023, 10, 16)},
    })
    i = 0  # count the documents returned by the cursor
    for doc in cursor:
        i += 1
    print(i)


    For the specified dates, this returned only 42 results, which seemed low to me: I expected more matches. To analyze the data, I created a DataFrame and loaded the data into it with pd.json_normalize:

    import pandas as pd

    df = pd.json_normalize(
        no_vacias,
        record_path=['items'],  # Path to the nested list you want to flatten
        meta=[
            'id',
            'date', 'city',
            'company'
        ]
    )


    When I run the equivalent filter on that DataFrame, I get a very different number of results:

    df_filtered = df[(df["date"] < "2023-10-16") & (df["date"] > "2023-09-01") & (df["price"] != 0)]
    df_filtered_grouped = df_filtered.groupby('id').first()
    df_filtered_grouped


    Note that I am grouping by id, so each document is counted only once. Even so, I get 217 rows, not 42. I do not understand why the two results differ.
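
    One possible explanation, offered as a sketch rather than a confirmed diagnosis: on an array field, {"items.price": {"$ne": 0}} matches only documents in which no item has a price of 0, whereas the DataFrame filter keeps every document that has at least one item with a non-zero price. The snippet below mimics both semantics in plain Python. The sample documents and the helper names (matches_dotted_ne, matches_elem_match) are hypothetical and exist only for illustration; the field names follow the question.

```python
from datetime import datetime

# Hypothetical sample documents; the values are invented for illustration.
docs = [
    {"id": 1, "items": [{"price": 10}, {"price": 0}]},  # mixed prices
    {"id": 2, "items": [{"price": 5}, {"price": 7}]},   # all non-zero
    {"id": 3, "items": [{"price": 0}]},                 # all zero
    {"id": 4, "items": []},                             # empty list
]


def matches_dotted_ne(doc):
    """Mimic {"items": {"$ne": []}, "items.price": {"$ne": 0}}:
    the list is non-empty and NO item has price == 0."""
    items = doc["items"]
    return items != [] and all(item.get("price") != 0 for item in items)


def matches_elem_match(doc):
    """Mimic {"items": {"$elemMatch": {"price": {"$ne": 0}}}}:
    AT LEAST ONE item has price != 0."""
    return any(item.get("price") != 0 for item in doc["items"])


ids_dotted = [d["id"] for d in docs if matches_dotted_ne(d)]
ids_elem_match = [d["id"] for d in docs if matches_elem_match(d)]

print(ids_dotted)      # ids kept by the original style of filter
print(ids_elem_match)  # ids kept by the per-item style of filter

# A filter closer to the DataFrame logic, assuming that "at least one
# item with a non-zero price" is the intended condition.
elem_match_filter = {
    "items": {"$elemMatch": {"price": {"$ne": 0}}},
    "date": {"$gte": datetime(2023, 9, 1), "$lt": datetime(2023, 10, 16)},
}
```

    If this is the cause, passing elem_match_filter to colecc_cotizaciones.count_documents(...) should give a count much closer to the DataFrame result. Small differences may remain, because the DataFrame filter uses a strict > on the start date while the query uses $gte, and because the result also depends on how no_vacias was built, which is not shown here.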

    Thank you in advance.

