
[Python] How to clean a JSON file to become a DataFrame?

Discussion in 'Python' started by Stack, October 7, 2024 at 10:42.

    I am scraping basketball stats from MaxPreps (a high school sports stats website). I can fetch the data in VS Code, but the JSON is huge and messy. All the correct numbers and player names are in there, but they are not arranged in neat rows and columns. How do I turn this JSON into a tidy rows-and-columns DataFrame?

    I tried pandas' json_normalize, but I could not make sense of the output. I compared my data with the JSON from the NBA's stats website, and theirs looked more organized: it was laid out like the stats table shown on the site.
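For context on why the normalized output can be hard to read, here is a minimal sketch of what `pd.json_normalize` does to a nested response. The `sample` dictionary is a made-up, heavily shortened stand-in (its values are assumptions, not the real MaxPreps payload); only the general shape of dicts containing a list is taken from the data in this post.

```python
import pandas as pd

# Hypothetical, shortened stand-in for a nested API response.
sample = {
    "status": 200,
    "data": {
        "teamId": "abc",
        "groups": [{"name": "Game Stats"}, {"name": "Other"}],
    },
}

# Normalizing the top-level object flattens nested dicts into dotted
# column names, but a list stays untouched inside a single cell,
# so the whole response collapses into one wide row.
flat = pd.json_normalize(sample)
print(flat.shape)
print(flat.columns.tolist())

# Normalizing the list itself yields one row per list element instead.
groups = pd.json_normalize(sample["data"]["groups"])
print(groups)
```

The takeaway is that `json_normalize` has to be pointed at the list that holds the records; called on the outermost object it can only produce a single row.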

    My Code:

    import numpy as np
    import pandas as pd
    import requests

    # Show every column when printing a DataFrame
    pd.set_option('display.max_columns', None)

    test_url = 'https://production.api.maxpreps.com/gatewayweb/react/team-season-player-stats/rollup/v1?teamId=cb3c4816-4749-4381-8c4c-5613ef8c89c9&sportSeasonId=77be7c75-cdf9-483d-867f-ea2af557e731'
    # Fetch the endpoint and parse the response body as JSON
    url_json = requests.get(url=test_url).json()

    # Flatten the top-level JSON object into a DataFrame
    df_normal = pd.json_normalize(url_json)
    print(df_normal)
    #print(url_json)


    {'status': 200, 'message': 'Success', 'cacheResult': 'None', 'data': {'teamId': 'cb3c4816-4749-4381-8c4c-5613ef8c89c9', 'sportSeasonId': '77be7c75-cdf9-483d-867f-ea2af557e731', 'groups': [{'name': 'Game Stats', 'subgroups': [{'name': '', 'stats': {'columns': [{'name': 'Jersey', 'header': '#', 'displayName': '#', 'isSortedColumn': True, 'overallValue': None, 'sortDirection': 1, 'columnType': 1}, {'name': 'Name', 'header': 'Name', 'displayName': 'Name', 'isSortedColumn': False, 'overallValue': None, 'sortDirection': 0, 'columnType': 2}, {'name': 'GamesPlayed', 'header': 'GP', 'displayName': 'Games Played', 'isSortedColumn': True, 'overallValue': '29', 'sortDirection': 2, 'columnType': 0}, {'name': 'MinutesPerGame', 'header': 'MPG', 'displayName': 'Minutes Per Game', 'isSortedColumn': True, 'overallValue': '0', 'sortDirection': 2, 'columnType': 0}, {'name': 'PointsPerGame', 'header': 'PPG', 'displayName': 'Points Per Game', 'isSortedColumn': True, 'overallValue': '53.2


    ^^^ JSON data from MaxPreps
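As a rough sketch of how the visible part of this structure could be walked, the snippet below follows `data -> groups -> subgroups -> stats -> columns` and collects the column definitions into a DataFrame. The `url_json` literal is a shortened copy of the output above, limited to the first three columns and a few of their fields. The printed output is cut off before the per-player values appear, so the key that holds them is unknown here and this sketch deliberately stops at the column metadata.

```python
import pandas as pd

# Shortened copy of the visible part of the response; the real payload is
# much longer, and the part holding the per-player values is not shown.
url_json = {
    "status": 200,
    "message": "Success",
    "data": {
        "groups": [
            {
                "name": "Game Stats",
                "subgroups": [
                    {
                        "name": "",
                        "stats": {
                            "columns": [
                                {"name": "Jersey", "header": "#", "displayName": "#"},
                                {"name": "Name", "header": "Name", "displayName": "Name"},
                                {"name": "GamesPlayed", "header": "GP", "displayName": "Games Played"},
                            ]
                        },
                    }
                ],
            }
        ]
    },
}

# Walk data -> groups -> subgroups -> stats -> columns and collect one
# record per column definition, tagged with the group it belongs to.
records = []
for group in url_json["data"]["groups"]:
    for subgroup in group["subgroups"]:
        for column in subgroup["stats"]["columns"]:
            records.append({"group": group["name"], **column})

columns_df = pd.DataFrame(records)
print(columns_df[["group", "name", "header", "displayName"]])
```

The `header` values (`#`, `Name`, `GP`, and so on) look like natural column names for the final table, assuming the player rows list their values in the same order as `columns`.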

    How I want it to look:

    parameters: {LeagueID: "00", PerMode: "Totals", StatCategory: "PTS", Season: "All Time", SeasonType: "Playoffs",…}
    resource: "leagueleaders"
    resultSet: {name: "LeagueLeaders",…}
        headers: ["PLAYER_ID", "PLAYER_NAME", "GP", "MIN", "FGM", "FGA", "FG_PCT", "FG3M", "FG3A", "FG3_PCT", "FTM",…]
        name: "LeagueLeaders"
        rowSet: [,…]
            [0 … 99]
                0: [2544, "LeBron James", 287, 11858, 2928, 5896, 0.497, 470, 1415, 0.332, 1836, 2479, 0.741, 430, 2153,…]
                1: [893, "Michael Jordan", 179, 7474, 2188, 4497, 0.487, 148, 446, 0.332, 1463, 1766, 0.828, 305, 847,…]
                2: [76003, "Kareem Abdul-Jabbar", 237, 8851, 2356, 4422, 0.533, 0, 4, 0, 1050, 1419, 0.74, 505, 1273,…]
                3: [977, "Kobe Bryant", 220, 8641, 2014, 4499, 0.448, 292, 882, 0.331, 1320, 1617, 0.816, 230, 889, 1119,…]


    {'resource': 'leagueleaders', 'parameters': {'LeagueID': '00', 'PerMode': 'Totals', 'StatCategory': 'PTS', 'Season': 'All Time', 'SeasonType': 'Playoffs', 'Scope': 'S', 'ActiveFlag': 'No'}, 'resultSet': {'name': 'LeagueLeaders', 'headers': ['PLAYER_ID', 'PLAYER_NAME', 'GP', 'MIN', 'FGM', 'FGA', 'FG_PCT', 'FG3M', 'FG3A', 'FG3_PCT', 'FTM', 'FTA', 'FT_PCT', 'OREB', 'DREB', 'REB', 'AST', 'STL', 'BLK', 'TOV', 'PF', 'PTS', 'AST_TOV', 'STL_TOV', 'EFG_PCT', 'TS_PCT', 'GP_RANK', 'MIN_RANK', 'FGM_RANK', 'FGA_RANK', 'FG_PCT_RANK', 'FG3M_RANK', 'FG3A_RANK', 'FG3_PCT_RANK', 'FTM_RANK', 'FTA_RANK', 'FT_PCT_RANK', 'OREB_RANK', 'DREB_RANK', 'REB_RANK', 'AST_RANK', 'STL_RANK', 'BLK_RANK', 'TOV_RANK', 'PF_RANK', 'PTS_RANK', 'AST_TOV_RANK', 'STL_TOV_RANK', 'EFG_PCT1', 'TS_PCT1'], 'rowSet': [[2544, 'LeBron James', 287, 11858, 2928, 5896, 0.497, 470, 1415, 0.332, 1836, 2479, 0.741, 430, 2153, 2583, 2067, 483, 275, 1034, 655, 8162, 1.999, 0.467, 0.536, 0.584, 1, 1, 1, 1, 591, 3, 2, 714, 1, 1, 1262, 16, 1, 4, 2, 1, 10, 1, 8, 1, 469, 1078, 491, 405], [893, 'Michael Jordan', 179, 7474, 2188, 4497, 0.487, 148, 446, 0.332, 1463, 1766, 0.828, 305, 847, 1152, 1022, 376, 158, 546, 541, 5987, 1.872, 0.689, 0.503, 0.568, 19, 12, 3, 3, 65


    Stats from NBA ^^^
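The NBA payload feels more organized because it already separates the column names (`headers`) from the rows (`rowSet`), which is the shape the `pd.DataFrame` constructor accepts directly. A minimal sketch, using a shortened copy of the response above (first five headers and the matching first five values of the first two rows); `nba_json` and `nba_df` are names chosen for this illustration:

```python
import pandas as pd

# Shortened copy of the NBA response: first five headers and the
# matching first five values of the first two rows.
nba_json = {
    "resource": "leagueleaders",
    "resultSet": {
        "name": "LeagueLeaders",
        "headers": ["PLAYER_ID", "PLAYER_NAME", "GP", "MIN", "FGM"],
        "rowSet": [
            [2544, "LeBron James", 287, 11858, 2928],
            [893, "Michael Jordan", 179, 7474, 2188],
        ],
    },
}

# Each inner list of rowSet is one row; headers supplies the column names.
result = nba_json["resultSet"]
nba_df = pd.DataFrame(result["rowSet"], columns=result["headers"])
print(nba_df)
```

The MaxPreps response nests its column definitions several levels deep, so the same one-liner does not apply to it without first locating the equivalent lists.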

    A similar question has already been answered, but it used stats from NBA.com, and I am not sure how that approach translates to the data I have from MaxPreps. I want to turn the MaxPreps data into a clean DataFrame so I can begin graphing it.
