Assuming your data is already sorted by topic, then student, then level. If it is not, sort it first.
import numpy as np
import pandas as pd

# Build one [source, dest, is_reply] triple per row by comparing the current row with the row above.
# Assumes df has a default 0..n-1 integer index; df.ix was removed in pandas 1.0, so df.loc is used here.
count_list = df.apply(lambda x: [df.loc[x.name - 1, 'student'] if x.name > 0 else np.nan, x.student, x.level > 1], axis=1).tolist()
# Create a count DataFrame from the triples.
df_count = pd.DataFrame(columns=['st_source', 'st_dest', 'reply_count'], data=count_list)
# Sum the counts for each source-dest pair, then drop rows where source and dest are the same student.
df_count = df_count.groupby(['st_source', 'st_dest']).sum().astype(int).reset_index()[lambda x: x.st_source != x.st_dest]
print(df_count)
Out[218]:
  st_source st_dest  reply_count
1         a       b            4
2         b       a            2
3         b       c            1
4         c       a            1
5         c       b            1
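The same "compare each row with the row above" idea can also be written without a row-wise apply, using Series.shift. This is only a sketch: the sample frame and the helper name reply_counts are made up for illustration (the original data is not shown here), and it assumes the column names topic, student and level and that the rows are already sorted as described above.

```python
import pandas as pd

# Hypothetical sample data (not the original poster's), already sorted so that
# each reply sits directly below the post it answers.
df = pd.DataFrame({
    "topic":   ["t1", "t1", "t1", "t2", "t2"],
    "student": ["a",  "b",  "a",  "c",  "b"],
    "level":   [1,    2,    3,    1,    2],
})

def reply_counts(frame):
    """Count replies per (source, dest) pair; hypothetical helper for illustration."""
    pairs = pd.DataFrame({
        # Student on the row above (NaN for the very first row).
        "st_source": frame["student"].shift(1),
        "st_dest": frame["student"],
        # A row counts as a reply only when its level is greater than 1.
        "reply_count": (frame["level"] > 1).astype(int),
    })
    # Drop the first row (no row above) and pairs where source and dest match.
    pairs = pairs.dropna(subset=["st_source"])
    pairs = pairs[pairs["st_source"] != pairs["st_dest"]]
    return pairs.groupby(["st_source", "st_dest"], as_index=False)["reply_count"].sum()

df_count = reply_counts(df)
print(df_count)
```

As in the apply-based version, a pair whose only adjacent rows are level 1 still shows up with a count of 0; filter on reply_count > 0 afterwards if those rows are not wanted.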