Use agg, as in @ayhan's edit (it is much faster than apply).
from collections import Counter
df.groupby("id")["val"].agg(lambda x: Counter([a for b in x for a in b]))
Out:
id
a {'val2': 2, 'val6': 1, 'val7': 1, 'val1': 1}
b {'val9': 1, 'val33': 1, 'val6': 1}
Name: val, dtype: object
Timing for this version:
%timeit df.groupby("id")["val"].agg(lambda x: Counter([a for b in x for a in b]))
1000 loops, best of 3: 820 µs per loop
Timing for @ayhan's version:
%timeit df.groupby('id')["val"].agg(lambda x: pd.Series([a for b in x.tolist() for a in b]).value_counts().to_dict() )
100 loops, best of 3: 1.91 ms per loop
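
For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The DataFrame below is hypothetical: it is only a guess at the shape of the input (one list of strings per row in "val"), chosen so the counts match the output shown above. The names df and counts are illustrative.

```python
from collections import Counter

import pandas as pd

# Hypothetical sample data (an assumption, not the asker's frame):
# each "val" cell holds a list of strings.
df = pd.DataFrame({
    "id": ["a", "a", "b"],
    "val": [
        ["val1", "val2", "val6"],
        ["val2", "val7"],
        ["val9", "val33", "val6"],
    ],
})

# For each id, flatten the lists in that group and count every item.
counts = df.groupby("id")["val"].agg(
    lambda x: Counter(a for b in x for a in b)
)

print(counts["a"])
print(counts["b"])
```

Each group's result is a Counter, so it can be used like a dict (for example, counts["a"]["val2"]). Timings will vary with the data size and the pandas version, so it is worth rerunning %timeit on your own frame.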