Generation

generate function
Tue, 24 Jan 2023

There is a Pandas dataframe:

   userId  movieId  rating   timestamp
0       1       31     2.5  1260759144
1       1     1029     3.0  1260759179
2       1     1061     3.0  1260759182
3       1     1129     2.0  1260759185
4       1     1172     4.0  1260759205

Calculate the average lifetime of users who gave more than 100 ratings. The lifetime is the difference between the maximum and minimum values of the timestamp column for a given userId.

import pandas as pd

df = pd.read_csv('ratings.csv')
df.head()
df.columns
# Number of ratings per user
df.groupby('userId').size()
df.groupby('userId').size().index
df.groupby('userId').size().values
#df.groupby('userId').size().values > 100
# Rating counts and user IDs of users with more than 100 ratings
df.groupby('userId').size().values[df.groupby('userId').size().values > 100]
df.groupby('userId').size().index[df.groupby('userId').size().values > 100]
# Rows and timestamps of a single user
df[df['userId'] == 1]
df[df['userId'] == 1].timestamp
# Note: diff() gives the gaps between consecutive rows, not the lifetime (max - min)
df[df['userId'] == 1].timestamp.diff()
df[df['userId'] == 1].timestamp.diff().min()
df[df['userId'] == 1].timestamp.diff().max()
df[df['userId'] == 1].timestamp.diff().max() - df[df['userId'] == 1].timestamp.diff().min()
def get_lif
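The generated snippet breaks off mid-definition, and its diff() calls measure the gaps between consecutive ratings rather than the span from a user's first rating to their last. Below is a minimal sketch of one way to compute what the prompt asks for. It assumes the column names shown in the prompt (userId, timestamp). The function name average_lifetime, its min_ratings parameter, and the toy data are illustrative and not part of the original output.

```python
import pandas as pd


def average_lifetime(ratings: pd.DataFrame, min_ratings: int = 100) -> float:
    """Mean of (max - min) timestamp over users with more than `min_ratings` ratings.

    Hypothetical helper: assumes `ratings` has 'userId' and 'timestamp' columns.
    """
    # One pass per user: rating count plus first and last timestamp
    per_user = ratings.groupby("userId")["timestamp"].agg(["count", "min", "max"])
    # Keep only users with strictly more than `min_ratings` ratings
    active = per_user[per_user["count"] > min_ratings]
    # Lifetime is the span between a user's last and first rating
    lifetimes = active["max"] - active["min"]
    # mean() of an empty selection is NaN, so NaN signals "no qualifying users"
    return float(lifetimes.mean())


# Made-up data to exercise the function (not taken from ratings.csv)
toy = pd.DataFrame({
    "userId": [1, 1, 1, 2, 2, 3],
    "timestamp": [100, 160, 130, 500, 900, 42],
})
result = average_lifetime(toy, min_ratings=1)
print(result)  # 230.0: users 1 and 2 qualify, with lifetimes 60 and 400
```

On the real file the call would presumably be average_lifetime(pd.read_csv('ratings.csv')). The result is in the same unit as the timestamp column. If the timestamps are Unix seconds, which the sample values suggest but the prompt does not state, dividing by 86400 converts the result to days.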
