featuretools.Entity
class featuretools.Entity(id, df, entityset, variable_types=None, index=None, time_index=None, secondary_time_index=None, last_time_index=None, already_sorted=False, make_index=False, verbose=False)
Represents an entity in an EntitySet and stores its data and relevant metadata.
An Entity is analogous to a table in a relational database.
See also: Relationship, Variable, EntitySet
__init__(id, df, entityset, variable_types=None, index=None, time_index=None, secondary_time_index=None, last_time_index=None, already_sorted=False, make_index=False, verbose=False)
Create Entity.
Parameters
id (str) – Id of Entity.
df (pd.DataFrame) – Dataframe providing the data for the entity.
entityset (EntitySet) – Entityset for this Entity.
variable_types (dict[str -> type/str/dict[str -> type]]) – Maps string variable ids to types (Variable), to a type_string (str), or to a (type, kwargs) pair that passes keyword arguments to the Variable.
index (str) – Name of id column in the dataframe.
time_index (str) – Name of time column in the dataframe.
secondary_time_index (dict[str -> str]) – Dictionary mapping columns in the dataframe to the time index column they are associated with.
last_time_index (pd.Series) – Time index of the last event for each instance across all child entities.
make_index (bool, optional) – If True, assume index does not exist as a column in the dataframe, and create a new column of that name using the integers in the range (0, len(dataframe)). Otherwise, assume index exists in the dataframe.
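The make_index description above can be illustrated with a minimal plain-Python sketch. It is not featuretools code: add_integer_index is a hypothetical helper, rows are modelled as a list of dicts rather than a pd.DataFrame, and the column name transaction_id is invented for the example. It assumes the new column holds one integer per row, taken from range(0, len(rows)).

```python
def add_integer_index(rows, index_name):
    """Return copies of `rows` with a new integer column named `index_name`.

    Hypothetical helper that mirrors the documented make_index=True
    behaviour; it is not the featuretools implementation.
    """
    # make_index=True assumes the index column is NOT already present.
    if any(index_name in row for row in rows):
        raise ValueError(f"column {index_name!r} already exists")
    # One integer per row, drawn from range(0, len(rows)).
    return [{index_name: i, **row} for i, row in zip(range(0, len(rows)), rows)]


# Three rows without an id column (illustrative data only).
rows = [{"amount": 12.5}, {"amount": 3.0}, {"amount": 7.25}]
indexed = add_integer_index(rows, "transaction_id")
index_values = [row["transaction_id"] for row in indexed]
```

With make_index left at False, the documented behaviour is the opposite assumption: the column named by index is expected to exist in the dataframe already.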
Methods
__init__(id, df, entityset[, …]) – Create Entity.
add_interesting_values([max_values, verbose]) – Find interesting values for categorical variables, to be used to …
convert_variable_type(variable_id, new_type) – Convert variable in dataframe to different type.
delete_variables(variable_ids) – Remove variables from entity’s dataframe and from self.variables.
query_by_values(instance_vals[, …]) – Query instances that have variable with given value.
set_index(variable_id[, unique]) – Parameter variable_id: name of an existing variable to set as index.
set_secondary_time_index(secondary_time_index)
set_time_index(variable_id[, already_sorted])
update_data(df[, already_sorted, …]) – Update entity’s internal dataframe, optionally making sure data is sorted, reference indexes to other entities are consistent, and last_time_indexes are consistent.
Attributes
df – Dataframe providing the data for the entity.
last_time_index – Time index of the last event for each instance across all child entities.
shape – Shape of the entity’s dataframe.
variable_types – Dictionary mapping variable ids to variable types.