08 - Exploring Semantic

1. Imports

import pymove as pm
from pymove.semantic import semantic

2. Load Data

move_df = pm.read_csv('geolife_sample.csv')
move_df.head()
lat lon datetime id
0 39.984094 116.319236 2008-10-23 05:53:05 1
1 39.984198 116.319322 2008-10-23 05:53:06 1
2 39.984224 116.319402 2008-10-23 05:53:11 1
3 39.984211 116.319389 2008-10-23 05:53:16 1
4 39.984217 116.319422 2008-10-23 05:53:21 1

Detect outlier points based on the distance between each point and its neighbors in the dataframe

outliers = semantic.outliers(move_df)
outliers[outliers['outlier']]
id lat lon datetime dist_to_prev dist_to_next dist_prev_to_next outlier
148 1 39.970511 116.341455 2008-10-23 10:32:53 1452.319115 1470.641291 71.088460 True
338 1 39.995042 116.326465 2008-10-23 10:44:24 10.801860 10.274331 1.465144 True
8133 1 39.991075 116.188395 2008-10-25 08:20:19 5.090766 6.247860 1.295191 True
10175 1 40.015169 116.311045 2008-10-25 23:40:12 23.454754 24.899678 3.766959 True
13849 1 39.977157 116.327151 2008-10-26 08:13:53 11.212682 10.221164 1.004375 True
... ... ... ... ... ... ... ... ...
216877 5 39.992096 116.329136 2009-03-12 15:57:42 7.035981 6.182086 1.909349 True
216927 5 39.998061 116.326402 2009-03-12 16:02:17 16.758753 19.151449 4.051863 True
217456 5 40.001983 116.328414 2009-03-19 04:35:52 179.564668 191.030434 15.276237 True
217465 5 40.001433 116.321387 2009-03-19 04:41:02 77.928727 75.686512 16.676141 True
217479 5 40.003626 116.317695 2009-03-19 05:02:52 9.725231 7.573682 2.463175 True

383 rows × 8 columns
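
The three distance columns suggest how the flag can be read: a point looks like an outlier when it sits far from both neighbors while those neighbors stay close to each other. The sketch below is a pure-Python illustration of a rule of that shape, not PyMove's implementation. The function name `is_outlier`, the `jump_coefficient` of 3 and the 1-meter `threshold` are assumptions, chosen because they agree with the rows shown above.

```python
def is_outlier(dist_to_prev, dist_to_next, dist_prev_to_next,
               jump_coefficient=3.0, threshold=1.0):
    """Flag a point that is far from both neighbors while they are close together.

    All distances are in meters. The default values are illustrative assumptions.
    """
    # Ignore GPS jitter: every distance must exceed the noise threshold.
    above_noise = min(dist_to_prev, dist_to_next, dist_prev_to_next) > threshold
    # Each leg must be longer than jump_coefficient times the direct
    # distance between the previous and the next point.
    scaled_direct = jump_coefficient * dist_prev_to_next
    return (above_noise
            and scaled_direct < dist_to_prev
            and scaled_direct < dist_to_next)


# Row 338 from the output above: about 10 m to each neighbor, 1.5 m between them.
print(is_outlier(10.801860, 10.274331, 1.465144))  # True
# A point on a straight path: the neighbors are farther apart than either leg.
print(is_outlier(5.0, 5.0, 10.0))  # False
```

Comparing each leg against the neighbor-to-neighbor distance keeps the rule scale-free, so it flags both the 1.4 km excursion at row 148 and the 10 m wobble at row 338.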

move_df.get_bbox()
(22.147577, 113.548843, 41.132062, 121.156224)

Detect points outside of a bounding box

fake_bbox = (20, 110, 40, 120)
out_bbox = semantic.create_or_update_out_of_the_bbox(move_df, fake_bbox)
out_bbox[out_bbox['out_bbox']]
lat lon datetime id out_bbox
415 40.000026 116.322214 2008-10-23 10:48:31 1 True
416 40.000082 116.322072 2008-10-23 10:48:33 1 True
417 40.000164 116.321996 2008-10-23 10:48:37 1 True
418 40.000245 116.321964 2008-10-23 10:48:40 1 True
419 40.000312 116.321921 2008-10-23 10:48:45 1 True
... ... ... ... ... ...
217643 40.000205 116.327173 2009-03-19 05:45:37 5 True
217644 40.000128 116.327171 2009-03-19 05:45:42 5 True
217645 40.000069 116.327179 2009-03-19 05:45:47 5 True
217646 40.000001 116.327219 2009-03-19 05:45:52 5 True
217651 40.000015 116.327433 2009-03-19 05:46:17 5 True

104787 rows × 5 columns
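
The bounding-box test itself is a pair of range checks. The following is a minimal sketch, assuming the tuple order `(lat_min, lon_min, lat_max, lon_max)` seen in the `get_bbox()` output and assuming that points exactly on the border count as inside. The helper name `out_of_bbox` is hypothetical and not part of PyMove.

```python
def out_of_bbox(lat, lon, bbox):
    """Return True when (lat, lon) lies outside bbox.

    bbox is assumed to be (lat_min, lon_min, lat_max, lon_max).
    """
    lat_min, lon_min, lat_max, lon_max = bbox
    inside = lat_min <= lat <= lat_max and lon_min <= lon <= lon_max
    return not inside


fake_bbox = (20, 110, 40, 120)
# Row 415 above: latitude just over 40, so it falls outside the box.
print(out_of_bbox(40.000026, 116.322214, fake_bbox))  # True
# Row 0 of the sample: latitude below 40, so it falls inside the box.
print(out_of_bbox(39.984094, 116.319236, fake_bbox))  # False
```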

Detect points where the GPS signal was lost, identified by the time gap between adjacent points

deactivated = semantic.create_or_update_gps_deactivated_signal(move_df)
deactivated[deactivated['deactivated_signal']]
id lat lon datetime time_to_prev time_to_next time_prev_to_next deactivated_signal
147 1 39.978068 116.327554 2008-10-23 06:01:57 5.0 16256.0 16261.0 True
148 1 39.970511 116.341455 2008-10-23 10:32:53 16256.0 7.0 16263.0 True
960 1 40.013803 116.306531 2008-10-23 12:04:28 2.0 41796.0 41798.0 True
961 1 40.013867 116.306473 2008-10-23 23:41:04 41796.0 2.0 41798.0 True
3088 1 39.977899 116.327063 2008-10-24 06:35:50 2.0 61695.0 61697.0 True
... ... ... ... ... ... ... ... ...
216997 5 40.007003 116.323674 2009-03-13 13:29:06 30157.0 5.0 30162.0 True
217054 5 40.010537 116.322052 2009-03-13 13:34:01 5.0 57426.0 57431.0 True
217055 5 40.009639 116.322056 2009-03-14 05:31:07 57426.0 5.0 57431.0 True
217452 5 39.990464 116.333510 2009-03-14 06:47:12 2.0 424105.0 424107.0 True
217453 5 40.001467 116.326665 2009-03-19 04:35:37 424105.0 5.0 424110.0 True

420 rows × 8 columns
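
The idea behind the flag can be sketched with the standard library alone: compute the gap to the previous and to the next timestamp, and flag a point when either gap is too long. This is an illustration rather than PyMove's code. The function name `deactivated_flags` and the 7200-second limit are assumptions, and the first and last timestamps below are derived from the `time_to_prev` and `time_to_next` values of rows 147 and 148.

```python
from datetime import datetime


def deactivated_flags(timestamps, max_gap_seconds=7200):
    """Flag each timestamp whose gap to the previous or next one exceeds the limit.

    Timestamps must be sorted and belong to a single id.
    """
    gaps = [(b - a).total_seconds() for a, b in zip(timestamps, timestamps[1:])]
    flags = []
    for i in range(len(timestamps)):
        # The first point has no previous gap and the last has no next gap.
        gap_prev = gaps[i - 1] if i > 0 else 0.0
        gap_next = gaps[i] if i < len(gaps) else 0.0
        flags.append(gap_prev > max_gap_seconds or gap_next > max_gap_seconds)
    return flags


fmt = "%Y-%m-%d %H:%M:%S"
stamps = [datetime.strptime(s, fmt) for s in (
    "2008-10-23 06:01:52",
    "2008-10-23 06:01:57",  # row 147 above
    "2008-10-23 10:32:53",  # row 148 above, 16256 s later
    "2008-10-23 10:33:00",
)]
print(deactivated_flags(stamps))  # [False, True, True, False]
```

Both ends of a long gap are flagged, which matches the output above, where flagged rows come in consecutive pairs such as 147 and 148.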

Detect GPS jumps, i.e. points whose distance to an adjacent point exceeds a maximum

jump = semantic.create_or_update_gps_jump(move_df)
print(jump[jump['gps_jump']].shape)
jump[jump['gps_jump']].head()
(46, 8)
id lat lon datetime dist_to_prev dist_to_next dist_prev_to_next gps_jump
3088 1 39.977899 116.327063 2008-10-24 06:35:50 0.140088 4361.216241 4361.148665 True
3089 1 40.013812 116.306483 2008-10-24 23:44:05 4361.216241 7.587244 4358.356247 True
12434 1 39.974821 116.333828 2008-10-26 03:27:37 1.358606 4536.318481 4536.121843 True
12435 1 39.976599 116.387014 2008-10-26 03:45:46 4536.318481 4.280041 4535.822332 True
23936 1 39.978222 116.327002 2008-10-31 08:06:33 10.665636 4328.751469 4318.102530 True
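
To see where a value such as `dist_to_next` comes from, the distance between two consecutive fixes can be recomputed with the haversine formula and compared against a limit. This is a rough sketch under stated assumptions: a spherical Earth with a 6371 km radius, a 3000-meter limit, and the hypothetical helper names `haversine_m` and `is_jump`. PyMove may compute distances differently.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; an assumption for this sketch


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def is_jump(point, neighbor, max_dist_m=3000.0):
    """Return True when two adjacent points are farther apart than max_dist_m."""
    return haversine_m(*point, *neighbor) > max_dist_m


p3088 = (39.977899, 116.327063)  # row 3088 above
p3089 = (40.013812, 116.306483)  # row 3089 above
print(round(haversine_m(*p3088, *p3089)))  # a little over 4 km
print(is_jump(p3088, p3089))  # True
```

Because the distance is shared by both endpoints, each jump marks two rows: the last point before it and the first point after it.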

Determine whether each point belongs to a short trajectory

short = semantic.create_or_update_short_trajectory(move_df)
short[short['short_traj']]
id lat lon datetime dist_to_prev time_to_prev speed_to_prev tid_part short_traj
148 1 39.970511 116.341455 2008-10-23 10:32:53 1452.319115 16256.0 0.089340 2 True
18244 1 39.993663 116.325751 2008-10-27 23:49:47 233.351618 4.0 58.337905 18 True
18795 1 39.983927 116.309349 2008-10-28 13:21:25 480.465717 8147.0 0.058975 22 True
26941 1 39.982361 116.330762 2008-11-01 06:02:13 270.452069 3.0 90.150690 39 True
27878 1 40.017806 116.307530 2008-11-02 09:44:34 454.090137 19527.0 0.023254 44 True
... ... ... ... ... ... ... ... ... ...
217471 5 40.001707 116.318926 2009-03-19 04:42:22 36.849631 5.0 7.369926 404 True
217472 5 40.001784 116.318965 2009-03-19 04:42:27 9.183862 5.0 1.836772 404 True
217473 5 40.002218 116.320306 2009-03-19 04:42:32 123.999483 5.0 24.799897 404 True
217474 5 40.004917 116.314376 2009-03-19 04:49:37 587.526625 425.0 1.382416 404 True
217475 5 40.004955 116.313697 2009-03-19 04:49:42 57.987365 5.0 11.597473 404 True

1151 rows × 9 columns
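
The `tid_part` column suggests a two-step process: split each trajectory into segments wherever the step to the previous point is too long, too slow to arrive, or too fast, then mark segments with few points as short. The sketch below illustrates that reading in plain Python. The function names, the split limits (3000 m, 7200 s, 50 m/s) and the `k_segment_max` parameter are all assumptions, not PyMove's confirmed behavior.

```python
from collections import Counter


def segment_ids(steps, max_dist=3000.0, max_time=7200.0, max_speed=50.0):
    """Assign a segment id to each point.

    steps holds one (dist_to_prev, time_to_prev, speed_to_prev) tuple per point.
    The first tuple is ignored because the first point has no predecessor.
    """
    ids = []
    current = 1
    for i, (dist, time, speed) in enumerate(steps):
        # Start a new segment when any limit is exceeded.
        if i > 0 and (dist > max_dist or time > max_time or speed > max_speed):
            current += 1
        ids.append(current)
    return ids


def short_flags(ids, k_segment_max=50):
    """Flag points whose segment has at most k_segment_max points."""
    counts = Counter(ids)
    return [counts[i] <= k_segment_max for i in ids]


steps = [
    (None, None, None),
    (10.0, 5.0, 2.0),
    (12.0, 5.0, 2.4),
    (1452.3, 16256.0, 0.089),  # long time gap, like row 148 above
    (233.35, 4.0, 58.3),       # speed above the limit, like row 18244 above
    (9.0, 5.0, 1.8),
    (8.0, 5.0, 1.6),
]
ids = segment_ids(steps)
print(ids)  # [1, 1, 1, 2, 3, 3, 3]
# A tiny k_segment_max keeps this toy example readable.
print(short_flags(ids, k_segment_max=2))
# [False, False, False, True, False, False, False]
```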