pymove.semantic package

Submodules

pymove.semantic.semantic module

Semantic operations.

outliers, create_or_update_out_of_the_bbox, create_or_update_gps_deactivated_signal, create_or_update_gps_jump, create_or_update_short_trajectory, create_or_update_gps_block_signal, filter_block_signal_by_repeated_amount_of_points, filter_block_signal_by_time, filter_longer_time_to_stop_segment_by_id

pymove.semantic.semantic.create_or_update_gps_block_signal(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', max_time_stop: float = 7200, new_label: str = 'block_signal', label_tid: str = 'tid_part', inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None

Creates a new feature that flags segments containing periods without movement.

Parameters:
  • move_data (dataFrame) – The input trajectories data.
  • max_time_stop (float, optional) – Maximum time allowed with speed 0, by default 7200
  • new_label (string, optional) – The name of the new feature with detected blocked signals, by default BLOCK
  • label_tid (str, optional) – The label of the column containing the ids of the formed segments (the new split id), by default TID_PART
  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns:

DataFrame with the additional features ‘dist_to_prev’, ‘time_to_prev’, ‘speed_to_prev’, ‘tid_dist’, ‘block_signal’, or None

Return type:

DataFrame
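
The idea behind this feature can be illustrated with a plain-Python sketch. It is not pymove's implementation: the row layout (segment id, speed to previous point, time to previous point) and the helper name flag_block_signal are assumptions made for illustration, as is the rule that a segment is blocked when its accumulated time at speed 0 exceeds max_time_stop.

```python
from collections import defaultdict


def flag_block_signal(rows, max_time_stop=7200):
    """Flag every row of a segment whose time at speed 0 exceeds max_time_stop.

    rows is a list of (tid, speed_to_prev, time_to_prev) tuples.
    This layout is hypothetical and only mirrors the feature names in the docs.
    """
    stopped_time = defaultdict(float)
    for tid, speed, elapsed in rows:
        if speed == 0:
            # Accumulate the time spent without moving, per segment.
            stopped_time[tid] += elapsed
    blocked = {tid for tid, total in stopped_time.items() if total > max_time_stop}
    # Every point of a blocked segment receives the flag.
    return [tid in blocked for tid, _, _ in rows]


rows = [
    ("a", 0, 4000),
    ("a", 0, 4000),
    ("a", 5, 10),
    ("b", 0, 100),
    ("b", 3, 50),
]
flags = flag_block_signal(rows)
```

Segment "a" accumulates 8000 seconds at speed 0, which is above the 7200 default, so all of its points are flagged; segment "b" is left unflagged.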

pymove.semantic.semantic.create_or_update_gps_deactivated_signal(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', max_time_between_adj_points: float = 7200, new_label: str = 'deactivated_signal', inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None

Creates a new feature that indicates whether a point is invalid.

A point is marked invalid if the max time between adjacent points is equal to or less than max_time_between_adj_points.

Parameters:
  • move_data (dataframe) – The input trajectories data.
  • max_time_between_adj_points (float, optional) – The max time between adjacent points, by default 7200
  • new_label (string, optional) – The name of the new feature with detected deactivated signals, by default DEACTIVATED
  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns:

DataFrame with the additional features ‘time_to_prev’, ‘time_to_next’, ‘time_prev_to_next’, ‘deactivated_signal’, or None

Return type:

DataFrame

pymove.semantic.semantic.create_or_update_gps_jump(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', max_dist_between_adj_points: float = 3000, new_label: str = 'gps_jump', inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None

Creates a new feature that indicates whether a point is a GPS jump.

A point is considered a jump if the distance between adjacent points is greater than max_dist_between_adj_points.

Parameters:
  • move_data (dataframe) – The input trajectories data.
  • max_dist_between_adj_points (float, optional) – The maximum distance between adjacent points, by default 3000
  • new_label (string, optional) – The name of the new feature with detected GPS jumps, by default GPS_JUMP
  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns:

DataFrame with the additional features ‘dist_to_prev’, ‘dist_to_next’, ‘dist_prev_to_next’, ‘gps_jump’, or None

Return type:

DataFrame
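
A minimal sketch of the jump rule, using only the standard library. It is an illustration rather than pymove's code: it assumes distances are measured in meters with the haversine formula, and that a point is flagged when its distance to the previous point exceeds the threshold. The helper names haversine_m and flag_gps_jumps are hypothetical.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def flag_gps_jumps(points, max_dist_between_adj_points=3000):
    """Flag points whose distance to the previous point exceeds the threshold.

    points is a list of (lat, lon) tuples ordered in time (assumed layout).
    The first point has no predecessor and is never flagged.
    """
    flags = [False] * len(points)
    for i in range(1, len(points)):
        dist = haversine_m(*points[i - 1], *points[i])
        flags[i] = dist > max_dist_between_adj_points
    return flags


points = [(0.0, 0.0), (0.0, 0.001), (0.0, 1.0), (0.0, 1.001)]
jump_flags = flag_gps_jumps(points)
```

Here the third point lies roughly 111 km from the second one, so it is the only point flagged with the default threshold of 3000.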

pymove.semantic.semantic.create_or_update_out_of_the_bbox(move_data: DataFrame, bbox: tuple[int, int, int, int], new_label: str = 'out_bbox', inplace: bool = False) → DataFrame | None

Create or update a boolean feature to detect points out of the bbox.

Parameters:
  • move_data (dataframe) – The input trajectories data.
  • bbox (tuple) – Tuple of 4 elements, containing the minimum and maximum values of latitude and longitude of the bounding box.
  • new_label (string, optional) – The name of the new feature with detected points out of the bbox, by default OUT_BBOX
  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns:

DataFrame with a boolean feature marking the points detected out of the bbox, or None

Return type:

DataFrame

Raises:

ValueError – If feature generation fails
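
The bounding-box test can be sketched in a few lines of plain Python. The element order (lat_min, lon_min, lat_max, lon_max) is an assumption, since the parameter description above does not state it, and the helper name flag_out_of_bbox is hypothetical.

```python
def flag_out_of_bbox(points, bbox):
    """Return True for each (lat, lon) point that lies outside the bounding box.

    bbox is assumed to be ordered as (lat_min, lon_min, lat_max, lon_max);
    check the library source before relying on this order.
    """
    lat_min, lon_min, lat_max, lon_max = bbox
    return [
        not (lat_min <= lat <= lat_max and lon_min <= lon <= lon_max)
        for lat, lon in points
    ]


bbox = (-4.0, -39.0, -3.0, -38.0)
points = [(-3.7, -38.5), (10.0, 10.0), (-3.0, -38.0)]
out_flags = flag_out_of_bbox(points, bbox)
```

In this sketch a point on the edge of the box counts as inside, which is a design choice of the example and not a documented property of the library.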

pymove.semantic.semantic.create_or_update_short_trajectory(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', max_dist_between_adj_points: float = 3000, max_time_between_adj_points: float = 7200, max_speed_between_adj_points: float = 50, k_segment_max: int = 50, label_tid: str = 'tid_part', new_label: str = 'short_traj', inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None

Creates a new feature that indicates whether a point belongs to a short trajectory.

Parameters:
  • move_data (dataframe) – The input trajectory data
  • max_dist_between_adj_points (float, optional) – Specify the maximum distance a point should have from the previous point, in order not to be dropped, by default 3000
  • max_time_between_adj_points (float, optional) – Specify the maximum travel time between two adjacent points, by default 7200
  • max_speed_between_adj_points (float, optional) – Specify the maximum speed of travel between two adjacent points, by default 50
  • k_segment_max (int, optional) – Specify the maximum number of segments in the trajectory, by default 50
  • label_tid (str, optional) – The label of the column containing the ids of the formed segments, by default TID_PART
  • new_label (str, optional) – The name of the new feature with short trajectories, by default SHORT
  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns:

DataFrame with the additional features ‘dist_to_prev’, ‘time_to_prev’, ‘speed_to_prev’, ‘tid_part’, ‘short_traj’, or None

Return type:

DataFrame
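
As a rough illustration, a short trajectory can be detected by counting the points of each segment once the segment ids exist. The sketch below assumes that k_segment_max is compared against the number of points per segment, which is one possible reading of the parameter description; the helper name flag_short_trajectories is hypothetical and the segmentation step itself is not shown.

```python
from collections import Counter


def flag_short_trajectories(tids, k_segment_max=50):
    """Flag points that belong to a segment with at most k_segment_max points.

    tids holds one segment id per point (the role played by the label_tid column).
    Comparing against the point count is an assumption of this sketch.
    """
    points_per_segment = Counter(tids)
    return [points_per_segment[tid] <= k_segment_max for tid in tids]


tids = ["a", "a", "a", "b", "c", "c"]
short_flags = flag_short_trajectories(tids, k_segment_max=2)
```

With k_segment_max set to 2, segment "a" (three points) is kept unflagged while segments "b" and "c" are marked as short.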

pymove.semantic.semantic.filter_block_signal_by_repeated_amount_of_points(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', amount_max_of_points_stop: float = 30.0, max_time_stop: float = 7200, filter_out: bool = False, label_tid: str = 'tid_part', inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None

Filters points with a blocked signal from the dataframe, based on the amount of points.

Parameters:
  • move_data (dataFrame) – The input trajectories data.
  • amount_max_of_points_stop (float, optional) – Maximum number of stopped points, by default 30
  • max_time_stop (float, optional) – Maximum time allowed with speed 0, by default 7200
  • filter_out (boolean, optional) – If set to True, it will return trajectory points with blocked signal, by default False
  • label_tid (str, optional) – The label of the column containing the ids of the formed segments, by default TID_PART
  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns:

Filtered DataFrame with the additional features ‘dist_to_prev’, ‘time_to_prev’, ‘speed_to_prev’, ‘tid_dist’, ‘block_signal’, or None

Return type:

DataFrame

pymove.semantic.semantic.filter_block_signal_by_time(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', max_time_stop: float = 7200, filter_out: bool = False, label_tid: str = 'tid_part', inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None

Filters points with a blocked signal from the dataframe, based on time.

Parameters:
  • move_data (dataFrame) – The input trajectories data.
  • max_time_stop (float, optional) – Maximum time allowed with speed 0, by default 7200
  • filter_out (boolean, optional) – If set to True, it will return trajectory points with blocked signal, by default False
  • label_tid (str, optional) – The label of the column containing the ids of the formed segments, by default TID_PART
  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns:

Filtered DataFrame with the additional features ‘dist_to_prev’, ‘time_to_prev’, ‘speed_to_prev’, ‘tid_dist’, ‘block_signal’, or None

Return type:

DataFrame
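
The effect of filter_out can be pictured with a small plain-Python sketch. It follows the parameter description (True returns the points with a blocked signal) and assumes that False returns the remaining points; the helper name filter_by_flag and the row layout are hypothetical.

```python
def filter_by_flag(rows, flags, filter_out=False):
    """Keep flagged rows when filter_out is True, unflagged rows otherwise.

    rows and flags are parallel lists; flags[i] says whether rows[i]
    was detected as a blocked-signal point. The False branch is an assumption.
    """
    return [row for row, flag in zip(rows, flags) if flag == filter_out]


rows = ["p0", "p1", "p2", "p3"]
flags = [False, True, True, False]
blocked_points = filter_by_flag(rows, flags, filter_out=True)
remaining_points = filter_by_flag(rows, flags, filter_out=False)
```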

pymove.semantic.semantic.filter_longer_time_to_stop_segment_by_id(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', dist_radius: float = 30, time_radius: float = 900, label_id: str = 'id', label_segment_stop: str = 'segment_stop', filter_out: bool = False, inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None

Filters the segment with the longest stop time from the dataframe.

Parameters:
  • move_data (dataFrame) – The input trajectories data.
  • dist_radius (float, optional) – The dist_radius defines the distance used in the segmentation, by default 30
  • time_radius (float, optional) – The time_radius used to determine if a segment is a stop, by default 900. If the user stayed in the segment for a time greater than time_radius, then the segment is a stop.
  • label_id (str, optional) – The label of the column containing the ids of the formed segments, by default TRAJ_ID
  • label_segment_stop (str, optional) – by default ‘segment_stop’
  • filter_out (boolean, optional) – If set to True, it will return trajectory points with longer time, by default False
  • inplace (boolean, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns:

Filtered DataFrame with the additional features ‘dist_to_prev’, ‘time_to_prev’, ‘speed_to_prev’, ‘tid_dist’, ‘block_signal’, or None

Return type:

DataFrame

pymove.semantic.semantic.outliers(move_data: 'PandasMoveDataFrame' | 'DaskMoveDataFrame', jump_coefficient: float = 3.0, threshold: float = 1, new_label: str = 'outlier', inplace: bool = False) → 'PandasMoveDataFrame' | 'DaskMoveDataFrame' | None

Create or update a boolean feature to detect outliers.

Parameters:
  • move_data (dataframe) – The input trajectory data
  • jump_coefficient (float, optional) – by default 3
  • threshold (float, optional) – Minimum value that the distance features must have in order to be considered outliers, by default 1
  • new_label (string, optional) – The name of the new feature with detected outliers, by default OUTLIER
  • inplace (bool, optional) – if set to true the original dataframe will be altered to contain the result of the filtering, otherwise a copy will be returned, by default False
Returns:

DataFrame with the trajectory outliers, or None

Return type:

DataFrame
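
One plausible reading of the two numeric parameters is sketched below; it should not be taken as the library's actual rule, because the role of jump_coefficient is not documented above. The sketch assumes a point is an outlier when all three distance features exceed threshold and the point is more than jump_coefficient times farther from each neighbor than the neighbors are from each other. The helper name flag_outliers and the tuple layout are hypothetical.

```python
def flag_outliers(rows, jump_coefficient=3.0, threshold=1):
    """Flag points that sit far away from two neighbors that are close together.

    rows is a list of (dist_to_prev, dist_to_next, dist_prev_to_next) tuples.
    The comparison rule is an assumption made for this illustration.
    """
    flags = []
    for dist_to_prev, dist_to_next, dist_prev_to_next in rows:
        # All distance features must be above the minimum threshold.
        above_threshold = min(dist_to_prev, dist_to_next, dist_prev_to_next) > threshold
        # The detour through the point must dwarf the direct neighbor distance.
        scaled = jump_coefficient * dist_prev_to_next
        is_jump = scaled < dist_to_prev and scaled < dist_to_next
        flags.append(above_threshold and is_jump)
    return flags


rows = [
    (500.0, 480.0, 20.0),  # far from both neighbors, which are close to each other
    (10.0, 12.0, 22.0),    # ordinary point on a steady path
    (0.5, 0.4, 0.1),       # below the minimum threshold
]
outlier_flags = flag_outliers(rows)
```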

Module contents

Contains semantic functions that add new information to the trajectories.
