"Pageviews are dead"
Remind you of anything?
Pageviews Above Replacement
(un-juking the stats)
- What if we could control for promotion when judging performance?
- From July through August, I collected data on the promotion and performance of over 21,000 articles published on nytimes.com
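One plausible reading of "controlling for promotion" is to compare each article's actual pageviews with the pageviews a promotion-based model would expect, and to call the difference its pageviews above replacement. The sketch below assumes that definition. The slides do not spell it out, and the function name, slugs, and numbers are made up for illustration.

```python
def pageviews_above_replacement(actual, predicted):
    """Assumed definition: pageviews beyond what promotion alone would predict."""
    return actual - predicted


# Hypothetical articles; "predicted" stands in for a promotion-based model's output.
articles = [
    {"slug": "heavily-promoted", "actual": 50000, "predicted": 60000},
    {"slug": "buried-but-popular", "actual": 12000, "predicted": 4000},
]

for article in articles:
    article["par"] = pageviews_above_replacement(
        article["actual"], article["predicted"]
    )

# Rank by performance relative to promotion rather than by raw pageviews.
ranked = sorted(articles, key=lambda a: a["par"], reverse=True)
for article in ranked:
    print(article["slug"], article["par"])
```

Under this reading, the article with fewer raw pageviews ranks first because it outperformed the promotion it received.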
Data sources
Promotional Data:
- ~200 NYT-related Twitter accounts
- ~20 NYT-related Facebook accounts
- ~20 section fronts
- One homepage
- One paper
Metadata:
- Article type: (video, slideshow, interactive, article, blogpost)
- Section: (US, World, Art, etc.)
Performance Data:
- Pageviews and Social Media Activity for each article
Predicting pageviews
- Sum all the pageviews for 7 days on the site
- Use promotional features and article metadata to predict this number
- Random Forests (for a numeric target like pageviews, the average of a bunch of decision trees; the mode is used when classifying)
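To show the idea behind the steps above, here is a toy version in plain Python: many one-split regression trees, each fit on a bootstrap sample with a random subset of features, with their predictions averaged. It is a minimal sketch and not the model used for these results. The feature columns (hours on homepage, number of tweets, in the paper) and all numbers are invented, and a real analysis would use deeper trees from an established library.

```python
import random
from statistics import mean


def fit_stump(rows, targets, feature_ids):
    """Find the single (feature, threshold) split with the lowest squared error."""
    best = None  # (error, feature, threshold, left_mean, right_mean)
    for f in feature_ids:
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, targets) if r[f] <= threshold]
            right = [y for r, y in zip(rows, targets) if r[f] > threshold]
            if not left or not right:
                continue
            lm, rm = mean(left), mean(right)
            err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
            if best is None or err < best[0]:
                best = (err, f, threshold, lm, rm)
    if best is None:  # no usable split: always predict the sample mean
        m = mean(targets)
        return (0, float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    f, threshold, left_mean, right_mean = stump
    return left_mean if row[f] <= threshold else right_mean


def fit_forest(rows, targets, n_trees=50, n_features=2, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]        # bootstrap sample
        feats = rng.sample(range(len(rows[0])), n_features)   # random feature subset
        forest.append(
            fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feats)
        )
    return forest


def predict_forest(forest, row):
    # Regression forests average the trees' predictions.
    return mean(predict_stump(s, row) for s in forest)


# Invented data: [hours_on_homepage, n_tweets, in_paper] -> 7-day pageviews.
rows = [[0, 0, 0], [1, 1, 0], [2, 3, 0], [5, 4, 1], [8, 6, 1], [10, 8, 1]]
targets = [1000, 1200, 3000, 9000, 15000, 20000]

forest = fit_forest(rows, targets)
high = predict_forest(forest, [9, 7, 1])  # heavily promoted article
low = predict_forest(forest, [0, 0, 0])   # unpromoted article
print(round(high), round(low))
```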
Variable importance
- Time on all section fronts
- Number of unique section fronts
- Was the article in the paper?
- Number of NYT-Twitter followers reached
- Time on homepage
- Number of NYT-tweets
- Is the article from Reuters?
- Is the article from the AP?
- Max rank on homepage
- Word count
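The slides do not say how this ranking was computed. One common approach is permutation importance: shuffle one feature's column, re-score the model, and see how much the error grows. The sketch below assumes that approach and uses a hypothetical stand-in model (`toy_predict`) with invented feature columns and numbers, so the ranking it prints says nothing about the real data.

```python
import random


def mse(predict, rows, targets):
    """Mean squared error of a prediction function on a dataset."""
    return sum((predict(r) - y) ** 2 for r, y in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Average increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(predict, rows, targets)
    importances = []
    for col in range(len(rows[0])):
        increase = 0.0
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
            increase += mse(predict, permuted, targets) - baseline
        importances.append(increase / n_repeats)
    return importances


def toy_predict(row):
    # Hypothetical model: ignores word count entirely.
    hours_on_fronts, n_tweets, word_count = row
    return 1500 * hours_on_fronts + 500 * n_tweets


# Invented data: [hours_on_section_fronts, n_tweets, word_count].
rows = [[0, 0, 400], [1, 1, 1200], [2, 3, 800], [5, 4, 600], [8, 6, 1500], [10, 8, 900]]
targets = [toy_predict(r) for r in rows]

importances = permutation_importance(toy_predict, rows, targets)
names = ["hours_on_section_fronts", "n_tweets", "word_count"]
for name, score in sorted(zip(names, importances), key=lambda p: p[1], reverse=True):
    print(name, round(score))
```

A feature the model never uses gets an importance of zero, because shuffling it cannot change any prediction.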
So what?
- Placing promotional data alongside pageviews gives us a better understanding of what the metric actually means.
- (NYT) Pageviews are actually fairly predictable (my model explains 90% of the variance)
- Incorporating this approach in your newsroom should be fairly painless with particle. However, you should first ask yourself what you're optimizing for.
- Predictive analytics can help increase your editorial responsiveness to readers' preferences and to the news cycle (see http://fast.qcri.org/).