featuretools.demo.load_retail

featuretools.demo.load_retail(id='demo_retail_data', nrows=None, return_single_table=False)

Returns the retail EntitySet example. The original dataset can be found here.

We have also modified the data: we renamed the columns, converted customer_id to a unique fake customer_name, dropped duplicates, added total and cancelled columns, and converted amounts from GBP to USD. You can download the modified CSV in gzip-compressed (7 MB) or uncompressed (43 MB) format.
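The modifications listed above can be pictured with a small sketch. This is not the script used to prepare the demo data: the raw column names, the sample rows, the exchange rate, the fake-name scheme, and the rule that an order ID starting with "C" marks a cancellation are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical raw rows; column names and values are illustrative only.
RAW_CSV = """InvoiceNo,StockCode,Quantity,UnitPrice,CustomerID
536365,85123A,6,2.55,17850
536365,85123A,6,2.55,17850
C536379,D,-1,27.50,14527
"""

GBP_TO_USD = 1.25  # assumed exchange rate, not the one used for the demo data


def clean_rows(raw_text, rate=GBP_TO_USD):
    """Rename columns, drop duplicate rows, and add total/cancelled columns."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        key = tuple(row.values())
        if key in seen:  # skip exact duplicate rows
            continue
        seen.add(key)
        quantity = int(row["Quantity"])
        unit_price = round(float(row["UnitPrice"]) * rate, 4)  # GBP -> USD
        cleaned.append({
            "order_id": row["InvoiceNo"],
            "product_id": row["StockCode"],
            # Stand-in for a unique fake name derived from the customer ID.
            "customer_name": f"customer_{row['CustomerID']}",
            "quantity": quantity,
            "unit_price": unit_price,
            "total": round(quantity * unit_price, 4),
            # Assumption: an order ID starting with "C" marks a cancellation.
            "cancelled": row["InvoiceNo"].startswith("C"),
        })
    return cleaned


rows = clean_rows(RAW_CSV)
```

Here the duplicated second row is dropped, so `rows` holds two records, each with the renamed columns plus `total` and `cancelled`.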

Parameters
  • id (str) – ID to assign to the EntitySet.

  • nrows (int) – Number of rows to load from the underlying CSV. If None, load all rows.

  • return_single_table (bool) – If True, return the data as a single table (the contents of the CSV) rather than an EntitySet. Default is False.

Examples

In [1]: import featuretools as ft

In [2]: es = ft.demo.load_retail()

In [3]: es
Out[3]: 
Entityset: demo_retail_data
  Entities:
    orders (shape = [22190, 3])
    products (shape = [3684, 3])
    customers (shape = [4372, 2])
    order_products (shape = [401704, 7])

Load a subset of the data:

In [4]: es = ft.demo.load_retail(nrows=1000)

In [5]: es
Out[5]: 
Entityset: demo_retail_data
  Entities:
    orders (shape = [67, 5])
    products (shape = [606, 3])
    customers (shape = [50, 2])
    order_products (shape = [1000, 7])
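One way to read the shapes above: nrows limits the rows read from the underlying CSV, so order_products has exactly 1000 rows, while the other entities presumably hold only the distinct orders, products, and customers that appear in those rows. The toy sketch below illustrates that relationship with hypothetical data and a hypothetical helper, `table_sizes`; it does not reproduce how Featuretools builds the EntitySet.

```python
import csv
import io
from itertools import islice

# Hypothetical single-table data: one row per product line within an order.
SINGLE_TABLE = """order_id,product_id,customer_name,quantity
1,A,Ann,2
1,B,Ann,1
2,A,Bob,5
3,C,Ann,1
"""


def table_sizes(csv_text, nrows=None):
    """Count the rows each derived table would have after reading `nrows` rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(islice(reader, nrows))  # nrows=None reads every row
    return {
        "order_products": len(rows),
        "orders": len({r["order_id"] for r in rows}),
        "products": len({r["product_id"] for r in rows}),
        "customers": len({r["customer_name"] for r in rows}),
    }


full = table_sizes(SINGLE_TABLE)
subset = table_sizes(SINGLE_TABLE, nrows=2)
```

With all four rows, the sketch counts 3 orders, 3 products, and 2 customers; with `nrows=2`, only the first order, its two products, and one customer remain.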