
Transform Pandas DataFrame with Natural Language


Pandas syntax can be verbose. Use natural language to transform Pandas DataFrames and write more concise code.


David @ WiseData · June 28, 2023


Introduction

As data analysts and scientists, we spend a lot of time manipulating data in Pandas. While Pandas is a powerful tool, writing the code for data transformations can be time-consuming. That's where WiseData comes in. WiseData is a Python library that lets us transform Pandas DataFrames using simple English commands.

In this post, we'll explore how to use WiseData to transform Pandas DataFrames with English.

You can download the Jupyter notebook from HERE.

Usage Instructions

1. Obtain an API Key

To use WiseData, you need an API key. Visit https://www.wisedata.app/ and enter your email address. The API key for the Python package will be sent to that address.

2. Installation

To install the library, use the following pip command:

pip install wisedata pandas numpy

3. Instantiation

Instantiate the WiseData class with your API key:

from wisedata import WiseData

# TODO: Copy your API key which you've received in your email here
wd = WiseData(api_key="YOUR_API_KEY")
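
Hardcoding the key is fine for a quick test, but in a shared notebook you may prefer to read it from the environment. The following is a minimal sketch. The variable name WISEDATA_API_KEY and the helper load_api_key are assumptions made for illustration and are not part of WiseData.

```python
import os


def load_api_key(env_var: str = "WISEDATA_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The variable name is a hypothetical convention, not something
    WiseData itself defines.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your WiseData API key."
        )
    return key


# Usage, with the constructor shown in this post:
#   from wisedata import WiseData
#   wd = WiseData(api_key=load_api_key())
```

This keeps the key out of the notebook itself, so the file can be shared without exposing it.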

4. Transform Data with English

Let's say we have a Pandas DataFrame with the following data:

country          gdp              happiness_index
United States    19294482071552   6.94
United Kingdom   2891615567872    7.16
France           2411255037952    6.66
Germany          3435817336832    7.07
Italy            1745433788416    6.38
Spain            1181205135360    6.4
Canada           1607402389504    7.23
Australia        1490967855104    7.22
Japan            4380756541440    5.87
China            14631844184064   5.12

We want to count the number of countries. We can pass an English instruction to WiseData to apply this transformation.

import pandas as pd

countries = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832, 1745433788416, 1181205135360, 1607402389504, 1490967855104, 4380756541440, 14631844184064],
    "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12]
})

df = wd.transform("Give me number of countries", {
  "countries": countries
})
print(df)

The resulting DataFrame will look like this:

Number of Countries
0 10
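
For comparison, the same count can be computed by hand in plain Pandas. This is a sketch of one hand-written equivalent, not the code WiseData generates internally. The column name is copied from the output above, and only the country column is reproduced here to keep the example short.

```python
import pandas as pd

countries = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
})

# Count distinct country names and wrap the result in a one-row DataFrame,
# mirroring the shape of the output shown above.
expected = pd.DataFrame({"Number of Countries": [countries["country"].nunique()]})
print(expected)
```

A hand-written version like this can serve as a quick cross-check on the result returned for an English prompt.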

5. Create a pivot table with English

WiseData also makes it easier to create a pivot table from a Pandas DataFrame.

df = wd.transform("Give me gdp data pivotted by country", {
  "countries": countries
})
print(df)

You will now get a DataFrame containing the new pivot table!
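
To see what the pivot prompt above replaces, here is a sketch of one hand-written Pandas version. It assumes one reading of the prompt (one row per country, with gdp as the value column). The DataFrame WiseData returns may be laid out differently. Only three rows of the sample data are used to keep it short.

```python
import pandas as pd

countries = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France"],
    "gdp": [19294482071552, 2891615567872, 2411255037952],
})

# One reading of the prompt: index the table by country and
# aggregate gdp for each country (each appears once, so sum is a no-op).
by_country = countries.pivot_table(values="gdp", index="country", aggfunc="sum")
print(by_country)
```

If countries should appear as columns instead, by_country.T transposes the result.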

6. Putting everything together

Let’s put everything together into code:

import pandas as pd
from wisedata import WiseData

# TODO: Copy your API key which you've received in your email here
wd = WiseData(api_key="YOUR_API_KEY")

countries = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832, 1745433788416, 1181205135360, 1607402389504, 1490967855104, 4380756541440, 14631844184064],
    "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12]
})

df = wd.transform("Give me number of countries", {
  "countries": countries
})
print(df)

df = wd.transform("Give me gdp data pivotted by country", {
  "countries": countries
})
print(df)

Benefits of transforming data with English using WiseData

Using WiseData to transform Pandas DataFrames with English has several benefits. First, it lets us manipulate DataFrames with natural language instead of Pandas syntax. Second, it lets us write more concise and readable code. In the pivot table example above, a reader can tell what the transformation is meant to do from the prompt itself, which is often easier than working it out from the equivalent Pandas code.

Conclusion

In this blog post, we've explored how to use WiseData to transform Pandas data with natural language. By using simple English commands, we can perform data transformations without having to write lengthy Pandas code. This can make data analysis faster and more accessible to a wider range of users. If you're looking for a way to streamline your data analysis workflow, give WiseData a try at https://www.wisedata.app!