About

The RecSys Challenge 2021 will be organized by Politecnico di Bari, ETH Zürich, Jönköping University, and the data set will be provided by Twitter. The challenge focuses on a real-world task of tweet engagement prediction in a dynamic environment. For 2021, the challenge considers four different engagement types: Likes, Retweet, Quote, and replies.

This year's challenge brings the problem even closer to Twitter's real recommender systems by introducing latency constraints. We will also increase the data size to encourage novel methods. Also, the data density will be increased in terms of the graph where users are considered to be nodes and interactions as edges.

The goal is twofold: to predict the probability of different engagement types of a target user for a set of tweets based on heterogeneous input data while providing fair recommendations. We are conscious that multi-goal optimization considering accuracy and fairness will be particularly challenging. However, we believe that the recommendation community is nowadays mature enough to face the challenge of providing accurate and, at the same time, fair recommendations.

Participation and Data

Twitter will make available a public dataset of close to 1 billion data points, >40 million each day over 28 days. Week 1 - 3 will be used for training and week 4 for evaluation and testing. Each datapoint contains the tweet along with engagement features, user features, and tweet features.

Participation for this challenge is subject to your acceptance of these Terms & Conditions , and your successful completion of the steps required within, including the registration and approval process with the Twitter Developer Program(To Be Announced on Early March)

The Terms & Conditions already require that all submissions are accompanied by reproducible code, so that we can inspect winning solutions in detail: “Your Submission must include the source code and any related information used to derive the results contained in your Submission. The source code must be released under an open-source license (Apache 2.0). A third party should be able to use your submitted source to regenerate your results.” Furthermore, we explicitly state in our rules that NO de-anonymization or access to data from Twitter users and user behavior from the Twitter API other than that in the challenge dataset is allowed. Enriching the data with other data sources remains possible.

If any of the rules mentioned in the Terms & Conditions (and explained further above) are broken and thus discovered by the organizers in the code submission, the participant(s) that the submission belongs to will be disqualified from the competition. As mentioned in the Terms and Conditions: “Organizers reserve the right, in their sole discretion, to disqualify any participant who makes a Submission that does not meet the Requirements or is in violation of these Terms.”

Tentative Timeline

Note: the timeline is subject to slight modifications.

Paper Submission Guidelines



Program Comittee

To be announced

Academic Organizers

Sponsor Mentors

Advisors