您可以将函数用作的参数np.vectorize()
,然后可以将其用作的参数,pandas.groupby.apply
如下所示:
haver_vec = np.vectorize(haversine, otypes=[np.int16])
distance = df.groupby('id').apply(lambda x: pd.Series(haver_vec(df.coordinates, x.coordinates)))
例如,具有以下示例数据:
length = 500
df = pd.DataFrame({'id':np.arange(length), 'coordinates':tuple(zip(np.random.uniform(-90, 90, length), np.random.uniform(-180, 180, length)))})
比较500点:
def haver_vect(data):
distance = data.groupby('id').apply(lambda x: pd.Series(haver_vec(data.coordinates, x.coordinates)))
return distance
%timeit haver_loop(df): 1 loops, best of 3: 35.5 s per loop
%timeit haver_vect(df): 1 loops, best of 3: 593 ms per loop