class featuretools.EntitySet(id=None, entities=None, relationships=None)

Stores all actual data for an entityset

__init__(id=None, entities=None, relationships=None)

Creates EntitySet

  • id (str) – Unique identifier to associate with this instance
  • entities (dict[str -> tuple(pd.DataFrame, str, str)]) – Dictionary of entities. Entries take the format {entity id -> (dataframe, id column, (time_column), (variable_types))}. Note that time_column and variable_types are optional.
  • relationships (list[(str, str, str, str)]) – List of relationships between entities. List items are a tuple with the format (parent entity id, parent variable, child entity id, child variable).


entities = {
    "cards" : (card_df, "id"),
    "transactions" : (transactions_df, "id", "transaction_time")
}

relationships = [("cards", "id", "transactions", "card_id")]

ft.EntitySet("my-entity-set", entities, relationships)
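Because each relationship is a plain tuple of (parent entity id, parent variable, child entity id, child variable), it can be sanity-checked before the EntitySet is built. The following is a minimal sketch in plain Python, not part of featuretools: `check_relationships` and `columns_by_entity` are hypothetical names, and lists of column names stand in for the DataFrames used in the example above.

```python
# Hypothetical helper (not a featuretools API): verifies that every
# relationship tuple refers to a known entity and a known column.
def check_relationships(columns_by_entity, relationships):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    for parent_id, parent_var, child_id, child_var in relationships:
        # Check the parent side first, then the child side.
        for entity_id, variable in ((parent_id, parent_var), (child_id, child_var)):
            if entity_id not in columns_by_entity:
                problems.append(f"unknown entity: {entity_id}")
            elif variable not in columns_by_entity[entity_id]:
                problems.append(f"unknown variable: {entity_id}.{variable}")
    return problems


# Assumed column layout for the two entities in the example above.
columns_by_entity = {
    "cards": ["id"],
    "transactions": ["id", "card_id", "transaction_time"],
}
relationships = [("cards", "id", "transactions", "card_id")]

problems = check_relationships(columns_by_entity, relationships)
```

An empty `problems` list means every tuple names an existing entity and column; it says nothing about whether the child values actually match parent ids.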


__init__([id, entities, relationships]) Creates EntitySet
add_interesting_values([max_values, verbose]) Find interesting values for categorical variables, to be used to generate “where” clauses
add_last_time_indexes([updated_entities]) Calculates the last time index values for each entity (the last time an instance or children of that instance were observed).
add_relationship(relationship) Add a new relationship between entities in the entityset
add_relationships(relationships) Add multiple new relationships to an entityset
concat(other[, inplace]) Combine entityset with another to create a new entityset with the combined data of both entitysets.
entity_from_dataframe(entity_id, dataframe) Load the data for a specified entity from a Pandas DataFrame.
find_backward_path(start_entity_id, …) Find a backward path between a start and goal entity
find_forward_path(start_entity_id, …) Find a forward path between a start and goal entity
find_path(start_entity_id, goal_entity_id[, …]) Find a path in the entityset represented as a DAG
from_metadata(metadata[, data_root])
get_backward_entities(entity_id[, deep]) Get entities that are in a backward relationship with entity
get_backward_relationships(entity_id) Get relationships where entity “entity_id” is the parent
get_forward_entities(entity_id[, deep]) Get entities that are in a forward relationship with entity
get_forward_relationships(entity_id) Get relationships where entity “entity_id” is the child
get_pandas_data_slice(filter_entity_ids, …) Get the slice of data related to the supplied instances of the index entity.
normalize_entity(base_entity_id, …[, …]) Create a new entity and relationship from unique values of an existing variable.
path_relationships(path, start_entity_id) Generate a list of the strings “forward” or “backward” corresponding to the direction of the relationship at each point in path.
plot([to_file]) Create a UML diagram-ish graph of the EntitySet.
related_instances(start_entity_id, …[, …]) Filter out all but relevant information from dataframes along path from start_entity_id to final_entity_id, exclude data if it does not lie within and time_last
to_parquet(path) Write entityset to disk in the parquet format, location specified by path.
to_pickle(path) Write entityset to disk in the pickle format, location specified by path.
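The path-finding methods listed above treat the entityset as a graph whose edges are relationships. Going by the description of get_forward_relationships (relationships where the entity is the child), a forward step moves from a child entity to its parent. The following is a minimal sketch of that idea using breadth-first search over relationship tuples. It is not the library's implementation: `find_forward_path_sketch` is a hypothetical name, and the "customers" entity is invented for illustration.

```python
from collections import deque


def find_forward_path_sketch(relationships, start_entity_id, goal_entity_id):
    """Breadth-first search that follows child -> parent links only.

    Each relationship is a tuple in the documented format:
    (parent entity id, parent variable, child entity id, child variable).
    Returns the list of relationships walked, or None if no path exists.
    """
    # Index relationships by their child entity so forward steps are cheap.
    forward = {}
    for rel in relationships:
        child_id = rel[2]
        forward.setdefault(child_id, []).append(rel)

    queue = deque([(start_entity_id, [])])
    seen = {start_entity_id}
    while queue:
        entity_id, path = queue.popleft()
        if entity_id == goal_entity_id:
            return path
        for rel in forward.get(entity_id, []):
            parent_id = rel[0]
            if parent_id not in seen:
                seen.add(parent_id)
                queue.append((parent_id, path + [rel]))
    return None  # goal is not reachable by forward steps


# "customers" is an assumed extra entity, added only to make a two-step path.
relationships = [
    ("customers", "id", "cards", "customer_id"),
    ("cards", "id", "transactions", "card_id"),
]
path = find_forward_path_sketch(relationships, "transactions", "customers")
```

Searching in the opposite direction (from "customers" to "transactions") returns None here, since that route consists only of backward steps.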


is_metadata Returns True if all of the entities contain no data (empty DataFrames).
metadata Version of this EntitySet with all data replaced with empty DataFrames.