There is a pandas DataFrame:

   timestamp   visit_id    url                                 region   user_id     traffic_source
0  1549980692  e3b0c44298  https://host.ru/3c19b4ef7371864fa3  Russia   b1613cc09f  yandex
1  1549980704  6e340b9cff  https://host.ru/c8d9213a31839f9a3a  Germany  4c3ec14bee  direct
2  1549980715  96a296d224  https://host.ru/b8b58337d272ee7b15  USA      a8c40697fb  yandex
3  1549980725  709e80c884  https://host.ru/b8b58337d272ee7b15  Italy    521ac1d6a0  yandex
4  1549980736  df3f619804  https://host.ru/b8b58337d272ee7b15  Russia   d7323c571c  yandex

Create a new DataFrame, summary, that counts how many traffic_source values correspond to each region.
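def count_sources_per_region(df):
    # count() returns a Series; reset_index() turns it into the requested DataFrame
    summary = df.groupby('region')['traffic_source'].count().reset_index()
    return summary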
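
To check the grouping against the sample rows from the question, here is a minimal sketch. It rebuilds the five rows by hand and counts traffic_source values per region. The column names source_count and visits and the variable summary_by_source are illustrative assumptions, not part of the task. The second grouping covers an alternative reading of the task, in which the count is wanted for each region and traffic source pair.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "timestamp": [1549980692, 1549980704, 1549980715, 1549980725, 1549980736],
    "visit_id": ["e3b0c44298", "6e340b9cff", "96a296d224", "709e80c884", "df3f619804"],
    "url": [
        "https://host.ru/3c19b4ef7371864fa3",
        "https://host.ru/c8d9213a31839f9a3a",
        "https://host.ru/b8b58337d272ee7b15",
        "https://host.ru/b8b58337d272ee7b15",
        "https://host.ru/b8b58337d272ee7b15",
    ],
    "region": ["Russia", "Germany", "USA", "Italy", "Russia"],
    "user_id": ["b1613cc09f", "4c3ec14bee", "a8c40697fb", "521ac1d6a0", "d7323c571c"],
    "traffic_source": ["yandex", "direct", "yandex", "yandex", "yandex"],
})

# Reading 1: how many non-null traffic_source values each region has.
# reset_index(name=...) converts the grouped Series into a DataFrame.
summary = (
    df.groupby("region")["traffic_source"]
      .count()
      .reset_index(name="source_count")  # "source_count" is an assumed column name
)

# Reading 2 (assumption): one row per (region, traffic_source) pair.
summary_by_source = (
    df.groupby(["region", "traffic_source"])
      .size()
      .reset_index(name="visits")  # "visits" is an assumed column name
)

print(summary)
print(summary_by_source)
```

Note that count() skips missing values, while size() counts every row in the group. On this sample the two agree because no traffic_source value is missing.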