Different Ways to Flatten Deeply Nested Jsons into a Pandas Data Frame

sunku sowmya Sree
2 min readJan 7, 2021
Source — Google

JSON(JavaScript Object Notation) is now a frequently used file format. Whether it’s a response from API or a MongoDB Collection object. Most People are now tending to use JSON because they have human-readable Text in a prettify format.

In this article, we will be seeing different ways on how to read and flatten a nested JSON and what are the modules used for it.

To start with Let’s see how to read a sample json

sample_json = {‘Name’:’Joey’, ‘Address’:{‘City’:’Bangalore’,’State’:’KA’}, ‘Country’:’India’}

Flattened data frame using son_normalize

Looks good!! This is how json_normalize can be used to flatten semi-structured JSON. But this alone can’t be used to flatten deeply nested Jsons.

Let’s look at another example

Sample Nested JSON

In this nested sample even if we use json_normalize still we can see a few columns which are nested

I have tried different methods on how to flatten nested columns in a data frame and written the below function.

Function

Using this function it will be easier to flatten any nested column in a data frame. Make use of this function and try to pass a list of columns that are nested in your data frame and this function will flatten those columns and return an updated data frame

Flattened DataFrame

An iPython notebook with the codes mentioned in the post is available here

Any Suggestions and comments are welcome. I hope this helps :)

--

--

sunku sowmya Sree

Data Science Enthusiast | Piano Player | Senior Software Engineer