
I often use pandas groupby to generate stacked tables, and then I often want to output the resulting nested relations as json. Is there any way to extract a nested json field from the stacked table it produces?

Let's say I have a df like:

year office candidate  amount
2010 mayor  joe smith  100.00
2010 mayor  jay gould   12.00
2010 govnr  pati mara  500.00
2010 govnr  jess rapp   50.00
2010 govnr  jess rapp   30.00

I can do:

grouped = df.groupby(['year', 'office', 'candidate']).sum()

print grouped
                       amount
year office candidate 
2010 mayor  joe smith   100
            jay gould    12
     govnr  pati mara   500
            jess rapp    80

Beautiful! Of course, what I'd really like to do is get nested json via a command along the lines of grouped.to_json. But that feature isn't available. Any workarounds?

So, what I really want is something like:

{"2010": {"mayor": [
                    {"joe smith": 100},
                    {"jay gould": 12}
                   ]
         }, 
          {"govnr": [
                     {"pati mara":500}, 
                     {"jess rapp": 80}
                    ]
          }
}
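
(For reference, the frame above can be rebuilt with something along these lines; the dtypes are an assumption on my part:)

import pandas as pd

df = pd.DataFrame({
    'year':      [2010, 2010, 2010, 2010, 2010],
    'office':    ['mayor', 'mayor', 'govnr', 'govnr', 'govnr'],
    'candidate': ['joe smith', 'jay gould', 'pati mara', 'jess rapp', 'jess rapp'],
    'amount':    [100.00, 12.00, 500.00, 50.00, 30.00],
})

grouped = df.groupby(['year', 'office', 'candidate']).sum()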

Don

  • The code above doesn't actually work, as the amount column values (e.g. '$30') are strings, so they get concatenated rather than summed as numbers. Also, it's unclear what you want in terms of json output; why isn't to_json working for you? Commented Jun 23, 2014 at 19:50
  • @AndyHayden Good points. I've edited to fix/clarify.
    – Don
    Commented Jun 23, 2014 at 20:32
  • @Don is there any solution?
    – skycrew
    Commented Sep 29, 2015 at 7:02
  • @skycrew See answer from chrisb below.
    – Don
    Commented Sep 29, 2015 at 15:07

4 Answers


I don't think there is anything built-in to pandas to create a nested dictionary of the data. Below is some code that should work in general for a series with a MultiIndex, using a defaultdict.

The nesting code iterates through each level of the MultiIndex, adding layers to the dictionary until the deepest layer is assigned to the Series value.

In [99]: from collections import defaultdict

In [100]: results = defaultdict(lambda: defaultdict(dict))

In [101]: for index, value in grouped.itertuples():
     ...:     for i, key in enumerate(index):
     ...:         if i == 0:
     ...:             nested = results[key]
     ...:         elif i == len(index) - 1:
     ...:             nested[key] = value
     ...:         else:
     ...:             nested = nested[key]

In [102]: results
Out[102]: defaultdict(<function <lambda> at 0x7ff17c76d1b8>, {2010: defaultdict(<type 'dict'>, {'govnr': {'pati mara': 500.0, 'jess rapp': 80.0}, 'mayor': {'joe smith': 100.0, 'jay gould': 12.0}})})

In [105]: import json

In [106]: print json.dumps(results, indent=4)
{
    "2010": {
        "govnr": {
            "pati mara": 500.0, 
            "jess rapp": 80.0
        }, 
        "mayor": {
            "joe smith": 100.0, 
            "jay gould": 12.0
        }
    }
}
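
For anyone on Python 3 and current pandas, here is a minimal sketch of the same idea (assuming grouped is the one-column summed frame from the question; the index keys are cast to str because json.dumps won't accept numpy integers like the year as keys):

import json
from collections import defaultdict

results = defaultdict(lambda: defaultdict(dict))

# each row is (index_tuple, amount): walk the index tuple, descending one
# dict layer per level and assigning the amount at the deepest level
for row in grouped.itertuples():
    index, value = row[0], row[1]
    nested = results
    for i, key in enumerate(index):
        key = str(key)
        if i == len(index) - 1:
            nested[key] = value
        else:
            nested = nested[key]

print(json.dumps(results, indent=4))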
  • @chrisb I am trying to adapt your answer to a similar problem here, but am tripped up by the grouped.itertuples(): stackoverflow.com/questions/37819622/…
    – spaine
    Commented Jun 14, 2016 at 19:18
  • This will work only for three levels; what if there are more?
    – skt7
    Commented Jun 8, 2018 at 19:09

I had a look at the solution above and figured out that it only works for 3 levels of nesting. This solution will work for any number of levels.

import json

# one dict per index level: dicts[0] is the innermost (deepest) level,
# dicts[-1] the outermost, which ends up holding the whole nested result
levels = len(grouped.index.levels)
dicts = [{} for i in range(levels)]
last_index = None

for index, value in grouped.itertuples():

    if not last_index:
        last_index = index

    # find the first level at which this row's index differs from the
    # previous row's, and start fresh dicts for all deeper levels
    for (ii, (i, j)) in enumerate(zip(index, last_index)):
        if not i == j:
            ii = levels - ii - 1
            dicts[:ii] = [{} for _ in dicts[:ii]]
            break

    # write the value bottom-up, hooking each level's dict into its parent
    for i, key in enumerate(reversed(index)):
        dicts[i][key] = value
        value = dicts[i]

    last_index = index

result = json.dumps(dicts[-1])
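
If you want the pretty-printed output from the question, json.dumps(dicts[-1], indent=4) works in place of the last line; on Python 3 you may also need to cast the index labels to str first, since json.dumps rejects numpy integer keys such as the year.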
  • Love this answer. FYI: in the latest versions of pandas, replace the levels = len(grouped.index.levels) line with levels = grouped.ndim Commented Oct 23, 2018 at 18:17

Here is a generic recursive solution for this problem:

def df_to_dict(df):
    # base case: a Series -- just return its plain dict
    if df.ndim == 1:
        return df.to_dict()

    # otherwise peel off the outermost index level with .xs() and recurse
    ret = {}
    for key in df.index.get_level_values(0):
        sub_df = df.xs(key)
        ret[key] = df_to_dict(sub_df)
    return ret
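
A possible call on the question's data (a sketch, assuming grouped is the summed DataFrame from above; note that the innermost key ends up being the column name, so leaves look like {"joe smith": {"amount": 100.0}}):

import json

# json.dumps rejects numpy-integer keys, so stringify the outermost (year) level
nested = {str(year): offices for year, offices in df_to_dict(grouped).items()}
print(json.dumps(nested, indent=4))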
  • This solution does not group based on the first column in the data frame
    – viru
    Commented May 21, 2021 at 22:50

I'm aware this is an old question, but I came across the same issue recently. Here's my solution. I borrowed a lot of stuff from chrisb's example (Thank you!).

This has the advantage that you can pass a lambda to extract the final value from whatever enumerable you want, and likewise a lambda (or key name) for each group.

from collections import defaultdict

def dict_from_enumerable(enumerable, final_value, *groups):
    # final_value and each group may be a callable or a key to look up on the item
    d = defaultdict(lambda: defaultdict(dict))
    group_count = len(groups)
    for item in enumerable:
        nested = d
        item_result = final_value(item) if callable(final_value) else item.get(final_value)
        for i, group in enumerate(groups, start=1):
            group_val = str(group(item) if callable(group) else item.get(group))
            if i == group_count:
                # deepest group: store the extracted value
                nested[group_val] = item_result
            else:
                # otherwise descend one dictionary level
                nested = nested[group_val]
    return d

In the question, you'd call this function like:

dict_from_enumerable(grouped.itertuples(), 'amount', 'year', 'office', 'candidate')

The first argument can be an array of data as well, not even requiring pandas.
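
For instance, a plain list of dicts works (a small sketch mirroring two rows of the question's data); with the DataFrame itself, grouped.reset_index().to_dict('records') is one way to produce records that the .get(...) lookups expect:

records = [
    {'year': 2010, 'office': 'mayor', 'candidate': 'joe smith', 'amount': 100.0},
    {'year': 2010, 'office': 'mayor', 'candidate': 'jay gould', 'amount': 12.0},
]

nested = dict_from_enumerable(records, 'amount', 'year', 'office', 'candidate')
# nested['2010']['mayor'] == {'joe smith': 100.0, 'jay gould': 12.0}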
