Different Ways to Flatten Deeply Nested Jsons into a Pandas Data Frame
JSON(JavaScript Object Notation) is now a frequently used file format. Whether it’s a response from API or a MongoDB Collection object. Most People are now tending to use JSON because they have human-readable Text in a prettify format.
In this article, we will be seeing different ways on how to read and flatten a nested JSON and what are the modules used for it.
To start with Let’s see how to read a sample json
sample_json = {‘Name’:’Joey’, ‘Address’:{‘City’:’Bangalore’,’State’:’KA’}, ‘Country’:’India’}
Looks good!! This is how json_normalize can be used to flatten semi-structured JSON. But this alone can’t be used to flatten deeply nested Jsons.
Let’s look at another example
In this nested sample even if we use json_normalize still we can see a few columns which are nested
I have tried different methods on how to flatten nested columns in a data frame and written the below function.
Using this function it will be easier to flatten any nested column in a data frame. Make use of this function and try to pass a list of columns that are nested in your data frame and this function will flatten those columns and return an updated data frame
An iPython notebook with the codes mentioned in the post is available here
Any Suggestions and comments are welcome. I hope this helps :)