pymove.core package

Submodules

pymove.core.dask module

DaskMoveDataFrame class.

class pymove.core.dask.DaskMoveDataFrame(data: DataFrame | list | dict, latitude: str = 'lat', longitude: str = 'lon', datetime: str = 'datetime', traj_id: str = 'id', n_partitions: int = 1)[source]

Bases: dask.dataframe.core.DataFrame, pymove.core.interface.MoveDataFrameAbstractModel

PyMove dataframe extending Dask DataFrame.
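
A minimal construction sketch (assumes dask is installed; the toy coordinates and column names below are illustrative, and datetime strings are expected to be parsed by the constructor):

import pandas as pd
from pymove.core.dask import DaskMoveDataFrame

# Toy trajectory data; column names match the constructor defaults above.
raw = pd.DataFrame({
    'lat': [39.984094, 39.984198],
    'lon': [116.319236, 116.319322],
    'datetime': ['2008-10-23 05:53:05', '2008-10-23 05:53:06'],
    'id': [1, 1],
})
move_df = DaskMoveDataFrame(
    raw, latitude='lat', longitude='lon', datetime='datetime',
    traj_id='id', n_partitions=1,
)
print(move_df.get_type())            # reports the backend type (expected: 'dask')
pdf = move_df.convert_to('pandas')   # switch to the pandas-backed implementation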

all(*args, **kwargs)[source]

Indicates if all elements are True, potentially over an axis.

any(*args, **kwargs)[source]

Indicates if any element is True, potentially over an axis.

append(*args, **kwargs)[source]

Append rows of other to the end of caller, returning a new object.

astype(*args, **kwargs)[source]

Casts a dask object to a specified dtype.

at

Access a single value for a row/column label pair.

columns

The column labels of the DataFrame.

convert_to(new_type: str) → MoveDataFrame | 'PandasMoveDataFrame' | 'DaskMoveDataFrame'[source]

Convert an object from one type to another specified by the user.

Parameters:new_type ('pandas' or 'dask') – The type for which the object will be converted.
Returns:The converted object.
Return type:A subclass of MoveDataFrameAbstractModel
copy(*args, **kwargs)[source]

Make a copy of this object's indices and data.

count(*args, **kwargs)[source]

Counts the non-NA cells for each column or row.

datetime

Checks for the DATETIME column and returns its value.

Returns:DATETIME column
Return type:Series
Raises:AttributeError – If the DATETIME column is not present in the DataFrame
describe(*args, **kwargs)[source]

Generate descriptive statistics.

drop(*args, **kwargs)[source]

Drops specified rows or columns of the dask DataFrame.

drop_duplicates(*args, **kwargs)[source]

Removes duplicated rows from the data.

dropna(*args, **kwargs)[source]

Removes missing data from dask DataFrame.

dtypes

Return the dtypes in the DataFrame.

duplicated(*args, **kwargs)[source]

Returns boolean Series denoting duplicate rows.

fillna(*args, **kwargs)[source]

Fills missing data in the dask DataFrame.

generate_date_features(*args, **kwargs)[source]

Create or update date feature.

generate_datetime_in_format_cyclical(*args, **kwargs)[source]

Create or update column with cyclical datetime feature.

generate_day_of_the_week_features(*args, **kwargs)[source]

Create or update the day of the week feature based on datetime.

generate_dist_features(*args, **kwargs)[source]

Create the three distance features, in meters, relative to a GPS point P.

generate_dist_time_speed_features(*args, **kwargs)[source]

Creates features of distance, time and speed between points.

generate_hour_features(*args, **kwargs)[source]

Create or update hour feature.

generate_move_and_stop_by_radius(*args, **kwargs)[source]

Create or update column with move and stop points by radius.

generate_speed_features(*args, **kwargs)[source]

Create the three speed features, in meters per second, relative to a GPS point P.

generate_tid_based_on_id_datetime(*args, **kwargs)[source]

Create or update trajectory id based on id and datetime.

generate_time_features(*args, **kwargs)[source]

Create the three time features, in seconds, relative to a GPS point P.

generate_time_of_day_features(*args, **kwargs)[source]

Create a feature time of day or period from datetime.

generate_weekend_features(*args, **kwargs)[source]

Create or update the weekend feature in the dataframe.

get_bbox(*args, **kwargs)[source]

Creates the bounding box of the trajectories.

get_type() → str[source]

Returns the type of the object.

Returns:A string representing the type of the object.
Return type:str
get_users_number(*args, **kwargs)[source]

Check and return number of users in trajectory data.

groupby(*args, **kwargs)[source]

Groups dask DataFrame using a mapper or by a Series of columns.

head(n: int = 5, npartitions: int = 1, compute: bool = True) → dask.dataframe.core.DataFrame[source]

Return the first n rows.

This function returns the first n rows for the object based on position. It is useful for quickly testing if your object has the right type of data in it.

Parameters:
  • n (int, optional, default 5) – Number of rows to select.
  • npartitions (int, optional, default 1) – Represents the number of partitions.
  • compute (bool, optional, default True) – Whether to perform the operation
Returns:

The first n rows of the caller object.

Return type:

same type as caller

iloc

Purely integer-location based indexing for selection by position.

index

The row labels of the DataFrame.

info(*args, **kwargs)[source]

Print a concise summary of a DataFrame.

isin(*args, **kwargs)[source]

Determines whether each element is contained in values.

isna(*args, **kwargs)[source]

Detect missing values.

join(*args, **kwargs)[source]

Join columns of another DataFrame.

lat

Checks for the LATITUDE column and returns its value.

Returns:LATITUDE column
Return type:Series
Raises:AttributeError – If the LATITUDE column is not present in the DataFrame
len(*args, **kwargs)[source]

Returns the number of rows in the trajectory data.

lng

Checks for the LONGITUDE column and returns its value.

Returns:LONGITUDE column
Return type:Series
Raises:AttributeError – If the LONGITUDE column is not present in the DataFrame
loc

Access a group of rows and columns by label(s) or a boolean array.

max(*args, **kwargs)[source]

Return the maximum of the values for the requested axis.

memory_usage(*args, **kwargs)[source]

Return the memory usage of each column in bytes.

merge(*args, **kwargs)[source]

Merge columns of another DataFrame.

min(*args, **kwargs)[source]

Return the minimum of the values for the requested axis.

nunique(*args, **kwargs)[source]

Count distinct observations over requested axis.

plot(*args, **kwargs)[source]

Plot the data of the dask DataFrame.

plot_all_features(*args, **kwargs)[source]

Generate a visualization for each column whose type matches the given dtype.

plot_traj_id(*args, **kwargs)[source]

Generate a visualization for a trajectory with the specified tid.

plot_trajs(*args, **kwargs)[source]

Generate a visualization that shows trajectories.

rename(*args, **kwargs)[source]

Alter axes labels.

reset_index(*args, **kwargs)[source]

Resets the dask DataFrame's index, using the default one.

sample(*args, **kwargs)[source]

Samples data from the dask DataFrame.

select_dtypes(*args, **kwargs)[source]

Returns a subset of the columns based on the column dtypes.

set_index(*args, **kwargs)[source]

Set the DataFrame index (row labels) using one or more existing columns or arrays.

shape

Return a tuple representing the dimensionality of the DataFrame.

shift(*args, **kwargs)[source]

Shifts by desired number of periods with an optional time freq.

show_trajectories_info(*args, **kwargs)[source]

Show dataset information from dataframe.

sort_values(*args, **kwargs)[source]

Sorts the values of the dask DataFrame.

tail(n: int = 5, npartitions: int = 1, compute: bool = True) → dask.dataframe.core.DataFrame[source]

Return the last n rows.

This function returns the last n rows for the object based on position. It is useful for quickly testing if your object has the right type of data in it.

Parameters:
  • n (int, optional, default 5) – Number of rows to select.
  • npartitions (int, optional, default 1) – Represents the number of partitions.
  • compute (bool, optional, default True) – Whether to perform the operation

Returns:

The last n rows of the caller object.

Return type:

same type as caller

time_interval(*args, **kwargs)[source]

Get time difference between max and min datetime in trajectory.

to_csv(*args, **kwargs)[source]

Write object to a comma-separated values (csv) file.

to_data_frame() → dask.dataframe.core.DataFrame[source]

Converts trajectory data to DataFrame format.

Returns:Represents the trajectory in DataFrame format.
Return type:dask.dataframe.DataFrame
to_dict(*args, **kwargs)[source]

Converts trajectory data to dict format.

to_grid(*args, **kwargs)[source]

Converts trajectory data to grid format.

to_numpy(*args, **kwargs)[source]

Converts trajectory data to numpy array format.

unique(*args, **kwargs)[source]

Return unique values of Series object.

values

Return a Numpy representation of the DataFrame.

write_file(*args, **kwargs)[source]

Write trajectory data to a new file.

pymove.core.dataframe module

MoveDataFrame class.

class pymove.core.dataframe.MoveDataFrame[source]

Bases: object

Auxiliary class to check and transform data into Pymove Dataframes.

static format_labels(current_id: str, current_lat: str, current_lon: str, current_datetime: str) → dict[source]

Format the labels to the PyMove pattern for the id, lat, lon and datetime columns.

Parameters:
  • current_id (str) – Represents the column name of feature id
  • current_lat (str) – Represents the column name of feature latitude
  • current_lon (str) – Represents the column name of feature longitude
  • current_datetime (str) – Represents the column name of feature datetime
Returns:

Represents a dict mapping the current data columns to the PyMove pattern columns.

Return type:

Dict
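
A short usage sketch (the source column names 'taxi_id', 'latitude', 'longitude' and 'timestamp' are hypothetical):

from pymove.core.dataframe import MoveDataFrame

# Build a rename mapping from the caller's column names to the PyMove defaults.
mapping = MoveDataFrame.format_labels(
    current_id='taxi_id',
    current_lat='latitude',
    current_lon='longitude',
    current_datetime='timestamp',
)
# The resulting dict can then be used with pandas: raw_df.rename(columns=mapping)
print(mapping)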

static has_columns(data: pandas.core.frame.DataFrame) → bool[source]

Checks whether the received dataset has ‘lat’, ‘lon’, ‘datetime’ columns.

Parameters:data (DataFrame) – Input trajectory data
Returns:Represents whether or not you have the required columns
Return type:bool
static validate_move_data_frame(data: pandas.core.frame.DataFrame)[source]

Converts the column type to the default type used by PyMove lib.

Parameters:

data (DataFrame) – Input trajectory data

Raises:
  • KeyError – If missing one of lat, lon, datetime columns
  • ValueError, ParserError – If the data types can’t be converted
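
A hedged sketch combining the two static helpers above on a raw pandas DataFrame (toy values; validate_move_data_frame is assumed to convert the columns in place, since no return value is documented):

import pandas as pd
from pymove.core.dataframe import MoveDataFrame

raw = pd.DataFrame({
    'lat': ['39.984094'],                       # string dtypes on purpose
    'lon': ['116.319236'],
    'datetime': ['2008-10-23 05:53:05'],
    'id': [1],
})
if MoveDataFrame.has_columns(raw):
    # Converts lat/lon/datetime to the default PyMove dtypes;
    # raises KeyError/ValueError on missing columns or unconvertible values.
    MoveDataFrame.validate_move_data_frame(raw)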

pymove.core.grid module

Grid class.

class pymove.core.grid.Grid(data: DataFrame | dict, cell_size: float | None = None, meters_by_degree: float | None = None)[source]

Bases: object

PyMove class representing a grid.
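
A construction sketch, assuming cell_size is given in meters (the same convention used by to_grid in pymove.core.pandas below); the toy data is illustrative:

import pandas as pd
from pymove.core.grid import Grid
from pymove.core.pandas import PandasMoveDataFrame

move_df = PandasMoveDataFrame(pd.DataFrame({
    'lat': [39.984094, 39.984211],
    'lon': [116.319236, 116.319389],
    'datetime': ['2008-10-23 05:53:05', '2008-10-23 05:53:11'],
    'id': [1, 1],
}))
grid = Grid(move_df, cell_size=15)   # virtual grid with 15-meter cells
print(grid.get_grid())               # bounds, grid sizes and cell size as a dict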

convert_one_index_grid_to_two(data: pandas.core.frame.DataFrame, label_grid_index: str = 'index_grid')[source]

Converts a unique grid id into lat-lon grid ids.

Parameters:
  • data (DataFrame) – Dataframe with grid lat-lon ids
  • label_grid_index (str, optional) – grid unique id column, by default INDEX_GRID
convert_two_index_grid_to_one(data: pandas.core.frame.DataFrame, label_grid_lat: str = 'index_grid_lat', label_grid_lon: str = 'index_grid_lon')[source]

Converts grid lat-lon ids to unique values.

Parameters:
  • data (DataFrame) – Dataframe with grid lat-lon ids
  • label_grid_lat (str, optional) – grid lat id column, by default INDEX_GRID_LAT
  • label_grid_lon (str, optional) – grid lon id column, by default INDEX_GRID_LON
create_all_polygons_on_grid()[source]

Create all polygons that are represented in a grid.

Stores the polygons in the grid_polygon key

create_all_polygons_to_all_point_on_grid(data: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame[source]

Create all polygons to all points represented in a grid.

Parameters:data (DataFrame) – Represents the dataset that contains lat, long and datetime
Returns:Represents the same dataset with new key ‘polygon’ where polygons were saved.
Return type:DataFrame
create_one_polygon_to_point_on_grid(index_grid_lat: int, index_grid_lon: int) → shapely.geometry.polygon.Polygon[source]

Create one polygon to point on grid.

Parameters:
  • index_grid_lat (int) – Represents the grid index that references latitude.
  • index_grid_lon (int) – Represents the grid index that references longitude.
Returns:

Represents a polygon of this cell in a grid.

Return type:

Polygon

create_update_index_grid_feature(data: pandas.core.frame.DataFrame, unique_index: bool = True, label_dtype: Callable = <class 'numpy.int64'>, sort: bool = True)[source]

Create or update index grid feature.

It is not necessary to pass dic_grid; one is created if not provided.

Parameters:
  • data (DataFrame) – Represents the dataset that contains lat, long and datetime.
  • unique_index (bool, optional) – How to index the grid, by default True
  • label_dtype (Callable, optional) – Represents the type of a value of new column in dataframe, by default np.int64
  • sort (bool, optional) – Represents if needs to sort the dataframe, by default True
get_grid() → dict[source]

Returns the grid object in a dict format.

Returns:
Dict with grid information
  • ‘lon_min_x’: minimum x of grid
  • ‘lat_min_y’: minimum y of grid
  • ‘grid_size_lat_y’: lat y size of grid
  • ‘grid_size_lon_x’: lon x size of grid
  • ‘cell_size_by_degree’: cell size in radians
Return type:Dict
point_to_index_grid(event_lat: float, event_lon: float) → tuple[int, int][source]

Locate the x and y grid coordinates of a point (lat, lon).

Parameters:
  • event_lat (float) – Represents the latitude of a point
  • event_lon (float) – Represents the longitude of a point
Returns:

Represents the index y and the index x in the grid of the point (lat, lon).

Return type:

Tuple[int, int]

read_grid_pkl(filename: str) → Grid[source]

Read a grid dict from a .pkl file.

Parameters:filename (str) – Represents the name of a file.
Returns:Grid object containing information about the virtual grid
Return type:Grid
save_grid_pkl(filename: str)[source]

Save the grid to a new .pkl file.

Parameters:filename (str) – Represents the name of a file.

pymove.core.interface module

class pymove.core.interface.MoveDataFrameAbstractModel[source]

Bases: abc.ABC

all()[source]
any()[source]
append()[source]
astype()[source]
at()[source]
columns()[source]
convert_to(new_type: str)[source]
copy()[source]
count()[source]
datetime()[source]
describe()[source]
drop()[source]
drop_duplicates()[source]
dropna()[source]
dtypes()[source]
duplicated()[source]
fillna()[source]
generate_date_features()[source]
generate_datetime_in_format_cyclical()[source]
generate_day_of_the_week_features()[source]
generate_dist_features()[source]
generate_dist_time_speed_features()[source]
generate_hour_features()[source]
generate_move_and_stop_by_radius()[source]
generate_speed_features()[source]
generate_tid_based_on_id_datetime()[source]
generate_time_features()[source]
generate_time_of_day_features()[source]
generate_weekend_features()[source]
get_bbox()[source]
get_type()[source]
get_users_number()[source]
groupby()[source]
head()[source]
iloc()[source]
index()[source]
info()[source]
isin()[source]
isna()[source]
join()[source]
lat()[source]
len()[source]
lng()[source]
loc()[source]
max()[source]
memory_usage()[source]
merge()[source]
min()[source]
nunique()[source]
plot()[source]
plot_all_features()[source]
plot_traj_id()[source]
plot_trajs()[source]
rename()[source]
reset_index()[source]
sample()[source]
select_dtypes()[source]
set_index()[source]
shape()[source]
shift()[source]
show_trajectories_info()[source]
sort_values()[source]
tail()[source]
time_interval()[source]
to_csv()[source]
to_data_frame()[source]
to_dict()[source]
to_grid()[source]
to_numpy()[source]
values()[source]
write_file()[source]

pymove.core.pandas module

PandasMoveDataFrame class.

class pymove.core.pandas.PandasMoveDataFrame(data: DataFrame | list | dict, latitude: str = 'lat', longitude: str = 'lon', datetime: str = 'datetime', traj_id: str = 'id')[source]

Bases: pandas.core.frame.DataFrame

PyMove dataframe extending Pandas DataFrame.
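
A minimal sketch of constructing the dataframe from a dict (toy data, illustrative only; datetime strings are expected to be parsed by the constructor):

from pymove.core.pandas import PandasMoveDataFrame

move_df = PandasMoveDataFrame({
    'lat': [39.984094, 39.984198, 39.984224],
    'lon': [116.319236, 116.319322, 116.319402],
    'datetime': ['2008-10-23 05:53:05', '2008-10-23 05:53:06', '2008-10-23 05:53:11'],
    'id': [1, 1, 1],
})
print(move_df.get_type())         # backend type string
print(move_df.get_bbox())         # (lat_min, lon_min, lat_max, lon_max)
move_df.show_trajectories_info()  # row count, datetime interval and bounding box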

append(other: 'PandasMoveDataFrame' | DataFrame, ignore_index: bool = False, verify_integrity: bool = False, sort: bool = False) → 'PandasMoveDataFrame'[source]

Append rows of other to the end of caller, returning a new object.

Columns in other that are not in the caller are added as new columns.

Parameters:
  • other (DataFrame or Series/dict-like object, or list of these) – The data to append.
  • ignore_index (bool, optional) – If True, do not use the index labels, by default False
  • verify_integrity (bool, optional) – If True, raise ValueError on creating index with duplicates, by default False
  • sort (bool, optional) – Sort columns if the columns of self and other are not aligned. The default sorting is deprecated and will change to not-sorting in a future version of pandas, by default False
Returns:

A dataframe containing rows from both the caller and other.

Return type:

PandasMoveDataFrame

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.append.html

astype(dtype: Callable | dict, copy: bool = True, errors: str = 'raise') → DataFrame[source]

Cast a pandas object to a specified dtype.

Parameters:
  • dtype (callable, dict) – Use a numpy.dtype or Python type to cast entire pandas object to the same type. Alternatively, use {col: dtype, …}, where col is a column label and dtype is a numpy.dtype or Python type to cast one or more of the DataFrame columns to column-specific types.
  • copy (bool, optional) – Return a copy when copy=True (be very careful setting copy=False as changes to values then may propagate to other pandas objects), by default True
  • errors (str, optional) –
    Control raising of exceptions on invalid data for provided dtype,
    by default ‘raise’
    • raise : allow exceptions to be raised
    • ignore : suppress exceptions. On error return original object
Returns:

Casted object to specified type.

Return type:

DataFrame

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.astype.html

Raises:AttributeError – If trying to change required types inplace
convert_to(new_type: str) → MoveDataFrame | 'PandasMoveDataFrame' | 'DaskMoveDataFrame'[source]

Convert an object from one type to another specified by the user.

Parameters:new_type ('pandas' or 'dask') – The type for which the object will be converted.
Returns:The converted object.
Return type:A subclass of MoveDataFrameAbstractModel
copy(deep: bool = True) → PandasMoveDataFrame[source]

Make a copy of this object’s indices and data.

When deep=True (default), a new object will be created with a copy of the calling object data and indices. Modifications to the data or indices of the copy will not be reflected in the original object (see notes below). When deep=False, a new object will be created without copying the calling object data or index (only references to the data and index are copied). Any changes to the data of the original will be reflected in the shallow copy (and vice versa).

Parameters:deep (bool, optional) – Make a deep copy, including a copy of the data and the indices. With deep=False neither the indices nor the data are copied, by default True
Returns:Object type matches caller.
Return type:PandasMoveDataFrame

Notes

When deep=True, data is copied but actual Python objects will not be copied recursively, only the reference to the object. This is in contrast to copy.deepcopy in the Standard Library, which recursively copies object data (see examples below). While Index objects are copied when deep=True, the underlying numpy array is not copied for performance reasons. Since Index is immutable, the underlying data can be safely shared and a copy is not needed.

datetime

Checks for the DATETIME column and returns its value.

Returns:DATETIME column
Return type:Series
Raises:AttributeError – If the DATETIME column is not present in the DataFrame
drop(labels: str | list[str] | None = None, axis: int | str = 0, index: str | list[str] | None = None, columns: str | list[str] | None = None, level: int | str | None = None, inplace: bool = False, errors: str = 'raise') → 'PandasMoveDataFrame' | DataFrame | None[source]

Removes rows or columns.

By specifying label names and corresponding axis, or by specifying directly index or column names. When using a multiindex, labels on different levels can be removed by specifying the level.

Parameters:
  • labels (str or list, optional) – Index or column labels to drop, by default None
  • axis (int or str, optional) – Whether to drop labels from the index (0 or ‘index’) or columns (1 or ‘columns’), by default 0
  • index (str or list, optional) – Alternative to specifying axis (labels, axis=0 is equivalent to index=labels), by default None
  • columns (str or list, optional) – Alternative to specifying axis (labels, axis=1 is equivalent to columns=labels), by default None
  • level (str or int, optional) – For MultiIndex, level from which the labels will be removed, by default None
  • inplace (bool, optional) – If True, do operation inplace and return None. Otherwise, make a copy, do operations and return, by default False
  • errors (str, optional) – ‘ignore’, ‘raise’, by default ‘raise’ If ‘ignore’, suppress error and only existing labels are dropped.
Returns:

Object without the removed index or column labels or None

Return type:

PandasMoveDataFrame, DataFrame

Raises:
  • AttributeError – If trying to drop a required column inplace
  • KeyError – If any of the labels is not found in the selected axis.

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html

drop_duplicates(subset: int | str | None = None, keep: str | bool = 'first', inplace: bool = False) → 'PandasMoveDataFrame' | None[source]

Uses pandas’ drop_duplicates function to remove duplicated rows from the data.

Parameters:
  • subset (int or str, optional) – Only consider certain columns for identifying duplicates, by default use all of the columns, by default None
  • keep (str, optional) –
    • first : Drop duplicates except for the first occurrence.
    • last : Drop duplicates except for the last occurrence.
    • False : Drop all duplicates.

    by default ‘first’

  • inplace (bool, optional) – Whether to drop duplicates in place or to return a copy, by default False
Returns:

Object with duplicated rows removed or None

Return type:

PandasMoveDataFrame

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop_duplicates.html

dropna(axis: int | str = 0, how: str = 'any', thresh: float | None = None, subset: list | None = None, inplace: bool = False)[source]

Removes missing data.

Parameters:
  • axis (0 or 'index', 1 or 'columns', None, optional) – Determine if rows or columns are removed, by default 0
    • 0, or ‘index’ : Drop rows which contain missing values.
    • 1, or ‘columns’ : Drop columns which contain missing values.
  • how (str, optional) –

    Determine if row or column is removed from DataFrame when we have at least one NA or all NA, by default ‘any’.

    • ’any’ : If any NA values are present, drop that row or column.
    • ’all’ : If all values are NA, drop that row or column.
  • thresh (float, optional) – Require that many non-NA values, by default None
  • subset (array-like, optional) – Labels along other axis to consider, by default None e.g. if you are dropping rows these would be a list of columns to include.
  • inplace (bool, optional) – If True, do operation inplace and return None, by default False
Returns:

Object with NA entries dropped or None

Return type:

PandasMoveDataFrame

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html

Raises:AttributeError – If trying to drop required columns inplace
fillna(value: Any | None = None, method: str | None = None, axis: int | str | None = None, inplace: bool = False, limit: int | None = None, downcast: dict | None = None)[source]

Fill NA/NaN values using the specified method.

Parameters:
  • value (scalar, dict, Series, or DataFrame) – Value to use to fill holes (e.g. 0), alternately a dict/Series/DataFrame of values specifying which value to use for each index (for a Series) or column (for a DataFrame). Values not in the dict/Series/DataFrame will not be filled. This value cannot be a list.
  • method ({'backfill', 'bfill', 'pad', 'ffill', None}, default None) – Method to use for filling holes in reindexed Series. pad / ffill: propagate last valid observation forward to next valid. backfill / bfill: use next valid observation to fill gap.
  • axis ({0 or 'index', 1 or 'columns'}) – Axis along which to fill missing values.
  • inplace (bool, default False) – If True, fill in-place. Note: this will modify any other views on this object (e.g., a no-copy slice for a column in a DataFrame).
  • limit (int, default None) – If method is specified, this is the maximum number of consecutive NaN values to forward/backward fill. In other words, if there is a gap with more than this number of consecutive NaNs, it will only be partially filled. If method is not specified, this is the maximum number of entries along the entire axis where NaNs will be filled. Must be greater than 0 if not None.
  • downcast (dict, default is None) – A dict of item->dtype of what to downcast if possible, or the str ‘infer’ which will try to downcast to an appropriate equal type (e.g. float64 to int64 if possible).
Returns:

Object with missing values filled or None

Return type:

PandasMoveDataFrame

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.fillna.html

generate_date_features(inplace: bool = True) → 'PandasMoveDataFrame' | None[source]

Create or update date feature based on datetime.

Parameters:inplace (bool, optional) – Represents whether the operation will be performed on the data provided or in a copy, by default True
Returns:Object with new features or None
Return type:PandasMoveDataFrame
generate_datetime_in_format_cyclical(label_datetime: str = 'datetime', inplace: bool = True) → 'PandasMoveDataFrame' | None[source]

Create or update column with cyclical datetime feature.

Parameters:
  • label_datetime (str, optional) – Represents the datetime column label, by default DATETIME
  • inplace (bool, optional) – Represents whether the operation will be performed on the data provided or in a copy, by default True
Returns:

Object with new features or None

Return type:

PandasMoveDataFrame

References

https://ianlondon.github.io/blog/encoding-cyclical-features-24hour-time/ https://www.avanwyk.com/encoding-cyclical-features-for-deep-learning/
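
The cyclical encoding referenced above maps an hour h to the pair (sin(2πh/24), cos(2πh/24)), so 23h and 0h end up adjacent. A standalone illustration of the idea (not the library's implementation):

import numpy as np

hours = np.array([0, 6, 12, 23])
hour_sin = np.sin(2 * np.pi * hours / 24)
hour_cos = np.cos(2 * np.pi * hours / 24)
# 23h maps to a point next to 0h on the unit circle, unlike the raw
# values 23 and 0, which look maximally far apart.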

generate_day_of_the_week_features(inplace: bool = True) → 'PandasMoveDataFrame' | None[source]

Create or update day of the week features based on datetime.

Parameters:inplace (bool, optional) – Represents whether the operation will be performed on the data provided or in a copy, by default True
Returns:Object with new features or None
Return type:PandasMoveDataFrame
generate_dist_features(label_id: str = 'id', label_dtype: Callable = <class 'numpy.float64'>, sort: bool = True, inplace: bool = True) → 'PandasMoveDataFrame' | None[source]

Create the three distance features, in meters, relative to a GPS point P.

Parameters:
  • label_id (str, optional) – Represents name of column of trajectories id, by default TRAJ_ID
  • label_dtype (callable, optional) – Represents column id type, by default np.float64
  • sort (bool, optional) – If sort == True the dataframe will be sorted, by default True
  • inplace (bool, optional) – Represents whether the operation will be performed on the data provided or in a copy, by default True
Returns:

Object with new features or None

Return type:

PandasMoveDataFrame

Examples

  • P to P.next = 2 meters
  • P to P.previous = 1 meter
  • P.previous to P.next = 1 meter
generate_dist_time_speed_features(label_id: str = 'id', label_dtype: Callable = <class 'numpy.float64'>, sort: bool = True, inplace: bool = True) → 'PandasMoveDataFrame' | None[source]

Adds distance, time and speed information to the dataframe.

First, create the three distance features to a GPS point P (lat, lon). Then, create two time features for point P: time to previous and time to next. Lastly, create two speed features using the time and distance features.

Parameters:
  • label_id (str, optional) – Represents name of column of trajectories id, by default TRAJ_ID
  • label_dtype (callable, optional) – Represents column id type, by default np.float64
  • sort (bool, optional) – If sort == True the dataframe will be sorted, by default True
  • inplace (bool, optional) – Represents whether the operation will be performed on the data provided or in a copy, by default True
Returns:

Object with new features or None

Return type:

PandasMoveDataFrame

Examples

  • dist_to_prev = 248.33 meters, dist_to_prev 536.57 meters
  • time_to_prev = 60 seconds, time_prev = 60.0 seconds
  • speed_to_prev = 4.13 m/s, speed_prev = 8.94 m/s.
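
A hedged usage sketch with toy data; the output column names dist_to_prev, time_to_prev and speed_to_prev follow the labels used in the examples above:

from pymove.core.pandas import PandasMoveDataFrame

mdf = PandasMoveDataFrame({
    'lat': [39.984094, 39.984198, 39.984224],
    'lon': [116.319236, 116.319322, 116.319402],
    'datetime': ['2008-10-23 05:53:05', '2008-10-23 05:53:06', '2008-10-23 05:53:11'],
    'id': [1, 1, 1],
})
mdf.generate_dist_time_speed_features()   # inplace=True by default
print(mdf[['id', 'dist_to_prev', 'time_to_prev', 'speed_to_prev']])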
generate_hour_features(inplace: bool = True) → 'PandasMoveDataFrame' | None[source]

Create or update hour features based on datetime.

Parameters:inplace (bool, optional) – Represents whether the operation will be performed on the data provided or in a copy, by default True
Returns:Object with new features or None
Return type:PandasMoveDataFrame
generate_move_and_stop_by_radius(radius: float = 0, target_label: str = 'dist_to_prev', inplace: bool = True)[source]

Create or update column with move and stop points by radius.

Parameters:
  • radius (float, optional) – Represents radius, by default 0
  • target_label (str, optional) – Represents column to compute, by default DIST_TO_PREV
  • inplace (bool, optional) – Represents whether the operation will be performed on the data provided or in a copy, by default True
Returns:

Object with new features or None

Return type:

PandasMoveDataFrame

generate_speed_features(label_id: str = 'id', label_dtype: Callable = <class 'numpy.float64'>, sort: bool = True, inplace: bool = True) → 'PandasMoveDataFrame' | None[source]

Create the three speed features, in meters per second, relative to a GPS point P.

Parameters:
  • label_id (str, optional) – Represents name of column of trajectories id, by default TRAJ_ID
  • label_dtype (callable, optional) – Represents column id type, by default np.float64
  • sort (bool, optional) – If sort == True the dataframe will be sorted, by default True
  • inplace (bool, optional) – Represents whether the operation will be performed on the data provided or in a copy, by default True
Returns:

Object with new features or None

Return type:

PandasMoveDataFrame

Raises:

ValueError – If feature generation fails

Examples

  • P to P.next = 1 meter/second
  • P to P.previous = 3 meters/second
  • P.previous to P.next = 2 meters/second
generate_tid_based_on_id_datetime(str_format: str = '%Y%m%d%H', sort: bool = True, inplace: bool = True) → 'PandasMoveDataFrame' | None[source]

Create or update trajectory id based on id and datetime.

Parameters:
  • str_format (str, optional) – Format to consider the datetime, by default ‘%Y%m%d%H’
  • sort (bool, optional) – Whether to sort the dataframe, by default True
  • inplace (bool, optional) – Represents whether the operation will be performed on the data provided or in a copy, by default True
Returns:

Object with new features or None

Return type:

PandasMoveDataFrame

generate_time_features(label_id: str = 'id', label_dtype: Callable = <class 'numpy.float64'>, sort: bool = True, inplace: bool = True) → 'PandasMoveDataFrame' | None[source]

Create the three time features, in seconds, relative to a GPS point P.

Parameters:
  • label_id (str, optional) – Represents name of column of trajectories id, by default TRAJ_ID
  • label_dtype (callable, optional) – Represents column id type, by default np.float64
  • sort (bool, optional) – If sort == True the dataframe will be sorted, by default True
  • inplace (bool, optional) – Represents whether the operation will be performed on the data provided or in a copy, by default True
Returns:

Object with new features or None

Return type:

PandasMoveDataFrame

Examples

  • P to P.next = 5 seconds
  • P to P.previous = 15 seconds
  • P.previous to P.next = 20 seconds
generate_time_of_day_features(inplace: bool = True) → 'PandasMoveDataFrame' | None[source]

Create or update time of day features based on datetime.

Parameters:inplace (bool, optional) – Represents whether the operation will be performed on the data provided or in a copy, by default True
Returns:
Object with new features or None
Early morning: 0H to 6H; Morning: 6H to 12H; Afternoon: 12H to 18H; Evening: 18H to 24H
Return type:PandasMoveDataFrame

Examples

  • datetime1 = 2019-04-28 02:00:56 -> period = Early Morning
  • datetime2 = 2019-04-28 08:00:56 -> period = Morning
  • datetime3 = 2019-04-28 14:00:56 -> period = Afternoon
  • datetime4 = 2019-04-28 20:00:56 -> period = Evening
generate_weekend_features(create_day_of_week: bool = False, inplace: bool = True) → 'PandasMoveDataFrame' | None[source]

Adds information to rows determining if it is a weekend day.

Create or update the weekend feature in the dataframe: it indicates whether the given day falls on the weekend or on a weekday.

Parameters:
  • create_day_of_week (bool, optional) – Indicates if the day column should be kept in the dataframe. If set to False the column will be dropped, by default False
  • inplace (bool, optional) – Represents whether the operation will be performed on the data provided or in a copy, by default True
Returns:

Object with new features or None

Return type:

PandasMoveDataFrame

get_bbox() → tuple[float, float, float, float][source]

Returns the bounding box of the dataframe.

A bounding box (usually shortened to bbox) is an area defined by two longitudes and two latitudes, where:

  • Latitude is a decimal number between -90.0 and 90.0.
  • Longitude is a decimal number between -180.0 and 180.0.

They usually follow the standard format of:

  • bbox = left, bottom, right, top
  • bbox = min Longitude, min Latitude, max Longitude, max Latitude

Returns:Represents a bounding box, that is, a tuple of 4 values with the min and max limits of latitude and longitude: lat_min, lon_min, lat_max, lon_max
Return type:Tuple[float, float, float, float]

Examples

(22.147577, 113.54884299999999, 41.132062, 121.156224)

get_type() → str[source]

Returns the type of the object.

Returns:A string representing the type of the object.
Return type:str
get_users_number() → int[source]

Check and return number of users in trajectory data.

Returns:Represents the number of users in trajectory data.
Return type:int
head(n: int = 5) → PandasMoveDataFrame[source]

Return the first n rows.

This function returns the first n rows for the object based on position. It is useful for quickly testing if your object has the right type of data in it.

Parameters:n (int, optional) – Number of rows to select, by default 5
Returns:The first n rows of the caller object.
Return type:PandasMoveDataFrame

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.head.html

isin(values: list | Series | DataFrame | dict) → DataFrame[source]

Determines whether each element in the DataFrame is contained in values.

Parameters:values (iterable, Series, DataFrame or dict) – The result will only be true at a location if all the labels match. If values is a Series, that is the index. If values is a dict, the keys must be the column names, which must match. If values is a DataFrame, then both the index and column labels must match.
Returns:DataFrame of booleans showing whether each element in the DataFrame is contained in values
Return type:DataFrame

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isin.html

join(other: 'PandasMoveDataFrame' | DataFrame, on: str | list | None = None, how: str = 'left', lsuffix: str = '', rsuffix: str = '', sort: bool = False) → 'PandasMoveDataFrame'[source]

Join columns of other, returning a new object.

Join columns with other PandasMoveDataFrame either on index or on a key column. Efficiently join multiple DataFrame objects by index at once by passing a list.

Parameters:
  • other (DataFrame, Series, or list of DataFrame) – Index should be similar to one of the columns in this one. If a Series is passed, its name attribute must be set, and that will be used as the column name in the resulting joined DataFrame.
  • on (str or list of str or array-like, optional) – Column or index level name(s) in the caller to join on the index in other, otherwise joins index-on-index. If multiple values given, the other DataFrame must have a MultiIndex. Can pass an array as the join key if it is not already contained in the calling DataFrame. Like an Excel VLOOKUP operation.
  • how ({'left', 'right', 'outer', 'inner'}, optional) –

    How to handle the operation of the two objects, by default ‘left’

    • left: use calling frame index (or column if on is specified)
    • right: use other index.
    • outer: form union of calling frame index (or column if on is specified) with other index, and sort it lexicographically.
    • inner: form intersection of calling frame index (or column if on is specified) with other index, preserving the order of the calling one.

  • lsuffix (str, optional) – Suffix to use from left frame overlapping columns, by default ‘’
  • rsuffix (str, optional) – Suffix to use from right frame overlapping columns, by default ‘’
  • sort (bool, optional) – Order result DataFrame lexicographically by the join key. If False, the order of the join key depends on the join type (how keyword)
Returns:

A dataframe containing columns from both the caller and other.

Return type:

PandasMoveDataFrame

Notes

Parameters on, lsuffix, and rsuffix are not supported when passing a list of DataFrame objects.

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.join.html

lat

Checks for the LATITUDE column and returns its value.

Returns:LATITUDE column
Return type:Series
Raises:AttributeError – If the LATITUDE column is not present in the DataFrame
len() → int[source]

Returns the number of rows in the trajectory data.

Returns:Represents the trajectory data length.
Return type:int
lng

Checks for the LONGITUDE column and returns its value.

Returns:LONGITUDE column
Return type:Series
Raises:AttributeError – If the LONGITUDE column is not present in the DataFrame
merge(right: 'PandasMoveDataFrame' | DataFrame | Series, how: str = 'inner', on: str | list | None = None, left_on: str | list | None = None, right_on: str | list | None = None, left_index: bool = False, right_index: bool = False, sort: bool = False, suffixes: tuple[str, str] = ('_x', '_y'), copy: bool = True, indicator: bool | str = False, validate: str | None = None) → 'PandasMoveDataFrame'[source]

Merge DataFrame or named Series objects with a database-style join.

The join is done on columns or indexes. If joining columns on columns, the DataFrame indexes will be ignored. Otherwise if joining indexes on indexes or indexes on a column or columns, the index will be passed on.

Parameters:
  • right (DataFrame or named Series) – Object to merge with.
  • how ({‘left’, ‘right’, ‘outer’, ‘inner’}, optional) –

    Type of merge to be performed, by default ‘inner’

    • left: use only keys from left frame, similar to a SQL left outer join; preserve key order.
    • right: use only keys from right frame, similar to a SQL right outer join; preserve key order.
    • outer: use union of keys from both frames, similar to a SQL full outer join; sort keys lexicographically.
    • inner: use intersection of keys from both frames, similar to a SQL inner join; preserve the order of the left keys.
  • on (label or list, optional) – Column or index level names to join on. These must be found in both DataFrames. If on is None and not merging on indexes then this defaults to the intersection of the columns in both DataFrames, by default None
  • left_on (str or list or array-like, optional) – Column or index level names to join on in the left DataFrame. Can also be an array or list of arrays of the length of the left DataFrame. These arrays are treated as if they are columns, by default None
  • right_on (str or list or array-like, optional) – Column or index level names to join on in the right DataFrame. Can also be an array or list of arrays of the length of the right DataFrame. These arrays are treated as if they are columns, by default None
  • left_index (bool, optional) – Use the index from the left DataFrame as the join key(s), by default False If it is a MultiIndex, the number of keys in the other DataFrame (either the index or a number of columns) must match the number of levels.
  • right_index (bool, optional) – Use the index from the right DataFrame as the join key, by default False Same caveats as left_index.
  • sort (bool, optional) – Sort the join keys lexicographically in the result DataFrame, by default False If False, the order of the join keys depends on the join type (how keyword).
  • suffixes (tuple of (str, str), optional) – Suffix to apply to overlapping column names in the left and right side respectively. To raise an exception on overlapping columns use (False, False) by default (‘_x’, ‘_y’)
  • copy (bool, optional) – If False, avoid copy if possible, by default True
  • indicator (bool or str, optional) – If True, adds a column to output DataFrame called ‘_merge’ with information on the source of each row. If string, column with information on source of each row will be added to output DataFrame, and column will be named value of string. Information column is Categorical-type and takes on a value of ‘left_only’ for observations whose merge key only appears in ‘left’ DataFrame, ‘right_only’ for observations whose merge key only appears in ‘right’ DataFrame, and ‘both’ if the observation’s merge key is found in both. by default False
  • validate (str, optional) –

    If specified, checks if merge is of specified type, by default None

    • ‘one_to_one’ or ‘1:1’: check if merge keys are unique in both left and right datasets.
    • ‘one_to_many’ or ‘1:m’: check if merge keys are unique in left dataset.
    • ‘many_to_one’ or ‘m:1’: check if merge keys are unique in right dataset.
    • ‘many_to_many’ or ‘m:m’: allowed, but does not result in checks.

Returns:

A DataFrame of the two merged objects.

Return type:

PandasMoveDataFrame

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html?highlight=merge#pandas.DataFrame.merge

rename(mapper: dict | Callable | None = None, index: dict | Callable | None = None, columns: dict | Callable | None = None, axis: int | str | None = None, copy: bool = True, inplace: bool = False) → 'PandasMoveDataFrame' | DataFrame | None[source]

Alter axes labels.

Function / dict values must be unique (1-to-1). Labels not contained in a dict / Series will be left as-is. Extra labels listed don’t throw an error.

Parameters:
  • mapper (dict or function, optional) – Dict-like or functions transformations to apply to that axis’ values. Use either mapper and axis to specify the axis to target with mapper, or index and columns, by default None
  • index (dict or function, optional) – Alternative to specifying axis (mapper, axis=0 is equivalent to index=mapper), by default None
  • columns (dict or function, optional) – Alternative to specifying axis (mapper, axis=1 is equivalent to columns=mapper), by default None
  • axis (int or str, optional) – Axis to target with mapper. Can be either the axis name (‘index’, ‘columns’) or number (0, 1), by default None
  • copy (bool, optional) – Also copy underlying data, by default True
  • inplace (bool, optional) – Whether to return a new DataFrame. If True then value of copy is ignored, by default False
Returns:

DataFrame with the renamed axis labels or None

Return type:

PandasMoveDataFrame, DataFrame

Raises:

AttributeError – If trying to rename a required column inplace

reset_index(level: int | str | tuple | list | None = None, drop: bool = False, inplace: bool = False, col_level: int | str = 0, col_fill: str = '') → 'PandasMoveDataFrame' | None[source]

Resets the DataFrame’s index, using the default one.

One or more levels can be removed, if the DataFrame has a MultiIndex.

Parameters:
  • level (int or str or tuple or list, optional) – Only the levels specified will be removed from the index. If set to None, all levels are removed, by default None
  • drop (bool, optional) – Do not try to insert index into dataframe columns This resets the index to the default integer index, by default False
  • inplace (bool, optional) – Modify the DataFrame in place (do not create a new object), by default False
  • col_level (int or str, optional) – If the columns have multiple levels, determines which level the labels are inserted into, by default 0
  • col_fill (str, optional) – If the columns have multiple levels, determines how the other levels are named If None then the index name is repeated, by default ‘’
Returns:

Object with the new index or None

Return type:

PandasMoveDataFrame

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.reset_index.html

sample(n: int | None = None, frac: float | None = None, replace: bool = False, weights: str | list | None = None, random_state: int | None = None, axis: int | str | None = None) → 'PandasMoveDataFrame'[source]

Return a random sample of items from an axis of object.

You can use random_state for reproducibility.

Parameters:
  • n (int, optional) – Number of items from axis to return. Cannot be used with frac, by default None
  • frac (float, optional) – Fraction of axis items to return. Cannot be used with n, by default None
  • replace (bool, optional) – Allow or disallow sampling of the same row more than once, by default False
  • weights (str or ndarray-like, optional) – If ‘None’ results in equal probability weighting. If passed a Series, will align with target object on index. Index values in weights not found in sampled object will be ignored and index values in sampled object not in weights will be assigned weights of zero. If called on a DataFrame, will accept the name of a column when axis = 0. Unless weights are a Series, weights must be same length as axis being sampled. If weights do not sum to 1, they will be normalized to sum to 1. Missing values in the weights column will be treated as zero. Infinite values not allowed. by default None
  • random_state (int or numpy.random.RandomState, optional) – Seed for the random number generator (if int), or numpy RandomState object,by default None
  • axis ({0 or 'index', 1 or 'columns', None}, optional) – Axis to sample. Accepts axis number or name. Default is stat axis for given data type (0 for Series and DataFrames), by default None
Returns:

A new object of same type as caller containing n items randomly sampled from the caller object.

Return type:

PandasMoveDataFrame

See also

numpy.random.choice()
Generates a random sample from a given 1-D numpy array.

Notes

If frac > 1, replace must be set to True.

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sample.html

set_index(keys: str | list[str], drop: bool = True, append: bool = False, inplace: bool = False, verify_integrity: bool = False) → 'PandasMoveDataFrame' | DataFrame | None[source]

Set the DataFrame index (row labels) using one or more existing columns or arrays.

Parameters:
  • keys (str, list) – Label or array-like or list of labels/arrays. This parameter can be either a single column key, a single array of the same length as the calling DataFrame, or a list containing an arbitrary combination of column keys and arrays
  • drop (bool, optional) – Delete columns to be used as the new index, by default True
  • append (bool, optional) – Whether to append columns to existing index, by default False
  • inplace (bool, optional) – Modify the DataFrame in place (do not create a new object), by default False
  • verify_integrity (bool, optional) – Check the new index for duplicates. Otherwise defer the check until necessary. Setting to False will improve the performance of this method, by default False
Returns:

Object with a new index or None

Return type:

PandasMoveDataFrame, DataFrame

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.set_index.html

Raises:AttributeError – If trying to change required columns types
shift(periods: int = 1, freq: DateOffset | Timedelta | str | None = None, axis: int | str = 0, fill_value: Any | None = None) → 'PandasMoveDataFrame'[source]

Shift index by desired number of periods with an optional time freq.

Parameters:
  • periods (int, optional, default 1) – Number of periods to shift. Can be positive or negative.
  • freq (DateOffset or Timedelta or str, optional, default None) – Offset to use from the series module or time rule (e.g. ‘EOM’). If freq is specified then the index values are shifted but the data is not realigned. That is, use freq if you would like to extend the index when shifting and preserve the original data. When freq is not passed, shift the index without realigning the data. If freq is passed (in this case, the index must be date or datetime, or it will raise a NotImplementedError), the index will be increased using the periods and the freq.
  • axis (0 or 'index', 1 or 'columns', None, optional, default 0) – Shift direction.
  • fill_value (object, optional, default None) – The scalar value to use for newly introduced missing values. The default depends on the dtype of self. For numeric data, np.nan is used. For datetime, timedelta, or period data, etc. NaT is used. For extension dtypes, self.dtype.na_value is used.
Returns:

A copy of the original object, shifted.

Return type:

PandasMoveDataFrame

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.shift.html

show_trajectories_info()[source]

Show dataset information from dataframe.

Displays the number of rows, datetime interval, and bounding box.

Examples

====================== INFORMATION ABOUT DATASET ======================
Number of Points: 217654
Number of IDs objects: 2
Start Date: 2008-10-23 05:53:05
End Date: 2009-03-19 05:46:37
Bounding Box: (22.147577, 113.54884299999999, 41.132062, 121.156224)
=======================================================================

sort_values(by: str | list[str], axis: int = 0, ascending: bool = True, inplace: bool = False, kind: str = 'quicksort', na_position: str = 'last') → 'PandasMoveDataFrame' | None[source]

Sorts the values of the dataframe along an axis.

Parameters:
  • by (str, list) – Name or list of names to sort the dataframe by
  • axis (int, optional) – Axis to be sorted: 0 or ‘index’ sorts rows, 1 or ‘columns’ sorts columns, by default 0
  • ascending (bool, optional) – Sort ascending vs. descending. Specify list for multiple sort orders. If this is a list of bool, it must match the length of by, by default True
  • inplace (bool, optional) – If set to True the original dataframe will be sorted in place, otherwise the operation will be made in a copy, that will be returned, by default False
  • kind (str, optional) – Choice of sorting algorithm: ‘quicksort’, ‘mergesort’, ‘heapsort’. For DataFrames, this option is only applied when sorting on a single column or label, by default ‘quicksort’
  • na_position (str, optional) – ‘first’, ‘last’, by default ‘last’. If ‘first’ puts NaNs at the beginning; if ‘last’ puts NaNs at the end.
Returns:

The sorted dataframe or None

Return type:

PandasMoveDataFrame

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html

tail(n: int = 5) → PandasMoveDataFrame[source]

Return the last n rows.

This function returns the last n rows for the object based on position. It is useful for quickly testing if your object has the right type of data in it.

Parameters:n (int, optional) – Number of rows to select, by default 5
Returns:The last n rows of the caller object.
Return type:PandasMoveDataFrame

References

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.tail.html

time_interval() → pandas._libs.tslibs.timedeltas.Timedelta[source]

Get time difference between max and min datetime in trajectory data.

Returns:Represents the time difference.
Return type:Timedelta
to_data_frame() → pandas.core.frame.DataFrame[source]

Converts trajectory data to DataFrame format.

Returns:Represents the trajectory in DataFrame format.
Return type:DataFrame
to_dicrete_move_df(local_label: str = 'local_label') → PandasMoveDataFrame[source]

Generate a discrete move dataframe.

Parameters:local_label (str, optional) – Represents the column name of feature local label, default LOCAL_LABEL
Returns:Represents the discretized PandasMoveDataFrame.
Return type:PandasDiscreteMoveDataFrame
to_grid(cell_size: float, meters_by_degree: float | None = None) → Grid[source]

Converts trajectory data to grid format.

Parameters:
  • cell_size (float) – Represents grid cell size.
  • meters_by_degree (float, optional) – Represents the number of meters per degree of latitude, by default lat_meters(-3.71839)
Returns:

Represents the trajectory in grid format

Return type:

Grid
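
A sketch chaining to_grid with the Grid helpers documented earlier (toy data; the file name is illustrative, and create_update_index_grid_feature is expected to add a grid-index feature column to the dataframe):

from pymove.core.pandas import PandasMoveDataFrame

mdf = PandasMoveDataFrame({
    'lat': [39.984094, 39.984198, 39.984224],
    'lon': [116.319236, 116.319322, 116.319402],
    'datetime': ['2008-10-23 05:53:05', '2008-10-23 05:53:06', '2008-10-23 05:53:11'],
    'id': [1, 1, 1],
})
grid = mdf.to_grid(cell_size=15)            # virtual grid over the trajectory bbox
grid.create_update_index_grid_feature(mdf)  # adds a grid-index feature to mdf
grid.save_grid_pkl('grid.pkl')              # persist; load again with read_grid_pkl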

write_file(file_name: str, separator: str = ', ')[source]

Write trajectory data to a new file.

Parameters:
  • file_name (str) – Represents the filename.
  • separator (str, optional) – Represents the information separator in a new file, by default ‘,’

pymove.core.pandas_discrete module

PandasDiscreteMoveDataFrame class.

class pymove.core.pandas_discrete.PandasDiscreteMoveDataFrame(data: DataFrame | list | dict, latitude: str = 'lat', longitude: str = 'lon', datetime: str = 'datetime', traj_id: str = 'id', local_label: str = 'local_label')[source]

Bases: pymove.core.pandas.PandasMoveDataFrame

PyMove discrete dataframe extending PandasMoveDataFrame.
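
A minimal sketch; the discrete variant additionally expects a symbolic location column (local_label), here filled with made-up values:

from pymove.core.pandas_discrete import PandasDiscreteMoveDataFrame

dmdf = PandasDiscreteMoveDataFrame({
    'lat': [39.984094, 39.984198, 39.984224],
    'lon': [116.319236, 116.319322, 116.319402],
    'datetime': ['2008-10-23 05:53:05', '2008-10-23 05:53:06', '2008-10-23 05:53:11'],
    'id': [1, 1, 1],
    'local_label': [1, 1, 2],
}, local_label='local_label')
dmdf.generate_prev_local_features()   # adds the previous-local feature per trajectory id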

discretize_based_grid(region_size: int = 1000)[source]

Discretizes the space into cells of the same size, assigning a unique id to each cell.

Parameters:region_size (int, optional) – Size of grid cell, by default 1000
generate_prev_local_features(label_id: str = 'id', local_label: str = 'local_label', sort: bool = True, inplace: bool = True) → 'PandasDiscreteMoveDataFrame' | None[source]

Create a prev_local feature with the label of the local previous to the current point.

Parameters:
  • label_id (str, optional) – Represents name of column of trajectory id, by default TRAJ_ID
  • local_label (str, optional) –
    Indicates name of column of place labels on symbolic trajectory,
    by default LOCAL_LABEL
  • sort (bool, optional) – Whether the dataframe will be sorted, by default True
  • inplace (bool, optional) – Represents whether the operation will be performed on the data provided or in a copy, by default True
Returns:

Object with new features or None

Return type:

PandasDiscreteMoveDataFrame

generate_tid_based_statistics(label_id: str = 'id', local_label: str = 'local_label', mean_coef: float = 1.0, std_coef: float = 1.0, statistics: DataFrame | None = None, label_tid_stat: str = 'tid_stat', drop_single_points: bool = False, inplace: bool = True) → 'PandasDiscreteMoveDataFrame' | None[source]

Splits the trajectories into segments based on time statistics of the segments.

Parameters:
  • label_id (str, optional) – Represents name of column of trajectory id, by default TRAJ_ID
  • local_label (str, optional) –
    Indicates name of column of place labels on symbolic trajectory,
    by default LOCAL_LABEL
  • mean_coef (float, optional) – Multiplication coefficient of the mean time for the segment, by default 1.0
  • std_coef (float, optional) – Multiplication coefficient of the std of time for the segment, by default 1.0
  • statistics (DataFrame, optional) – Time Statistics of the pairwise local labels, by default None
  • label_tid_stat (str, optional) – The label of the column containing the ids of the formed segments. It is the new split id, by default TID_STAT
  • drop_single_points (bool, optional) – Whether to drop the trajectories with only one point, by default False
  • inplace (bool, optional) – Represents whether the operation will be performed on the data provided or in a copy, by default True
Returns:

Object with new features or None

Return type:

PandasDiscreteMoveDataFrame

Raises:
  • KeyError – If missing local_label column
  • ValueError – If the data contains only null values

Module contents

Contains the core of PyMove.

MoveDataFrame, PandasMoveDataFrame, DaskMoveDataFrame, PandasDiscreteMoveDataFrame, Grid

class pymove.core.MoveDataFrameAbstractModel[source]

Bases: abc.ABC

all()[source]
any()[source]
append()[source]
astype()[source]
at()[source]
columns()[source]
convert_to(new_type: str)[source]
copy()[source]
count()[source]
datetime()[source]
describe()[source]
drop()[source]
drop_duplicates()[source]
dropna()[source]
dtypes()[source]
duplicated()[source]
fillna()[source]
generate_date_features()[source]
generate_datetime_in_format_cyclical()[source]
generate_day_of_the_week_features()[source]
generate_dist_features()[source]
generate_dist_time_speed_features()[source]
generate_hour_features()[source]
generate_move_and_stop_by_radius()[source]
generate_speed_features()[source]
generate_tid_based_on_id_datetime()[source]
generate_time_features()[source]
generate_time_of_day_features()[source]
generate_weekend_features()[source]
get_bbox()[source]
get_type()[source]
get_users_number()[source]
groupby()[source]
head()[source]
iloc()[source]
index()[source]
info()[source]
isin()[source]
isna()[source]
join()[source]
lat()[source]
len()[source]
lng()[source]
loc()[source]
max()[source]
memory_usage()[source]
merge()[source]
min()[source]
nunique()[source]
plot()[source]
plot_all_features()[source]
plot_traj_id()[source]
plot_trajs()[source]
rename()[source]
reset_index()[source]
sample()[source]
select_dtypes()[source]
set_index()[source]
shape()[source]
shift()[source]
show_trajectories_info()[source]
sort_values()[source]
tail()[source]
time_interval()[source]
to_csv()[source]
to_data_frame()[source]
to_dict()[source]
to_grid()[source]
to_numpy()[source]
values()[source]
write_file()[source]