Stack Overflow: Voting Patterns in Detail
Up, Down, all around. Offensive? Close? Spam! Inform Moderator...
Continuing to investigate user voting patterns on Stack Overflow has become a hobby (obsession?) of mine. Thanks in part to my curiosity and in part to nobody_ (now known mysteriously as 'Kyle Cronin', the administrator of the Unofficial Stack Overflow Meta Discussion Forum) egging me on, I quickly whipped up a graph showing the propensity to "Up Vote versus Reputation".

Up Vote (as a percentage; % = up / (up + down)) for five reputation tiers of users with at least one up vote, at least one down vote and a reputation of at least 100 (the point at which one is allowed to down vote). The x-axis represents the five user tiers.
The first three tiers represent ~5,000 users each, the fourth ~1,500 and the fifth ~125.
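The percentage in the caption can be sketched in a few lines of Python. This is only an illustration of the formula and the filter described above, not the code used to draw the graph. The tier boundaries in `TIERS`, the `UserVotes` record and the choice to pool votes within a tier (rather than average per-user ratios) are all assumptions; the post only states the minimum reputation of 100.

```python
from dataclasses import dataclass

# Hypothetical tier lower bounds and labels; the real cut-offs are not given
# in the post. Only the minimum reputation of 100 comes from the text.
TIERS = [(100, "100+"), (500, "500+"), (1000, "1,000+"),
         (5000, "5,000+"), (10000, "10,000+")]
MIN_REPUTATION = 100  # reputation needed to down vote, per the post


@dataclass
class UserVotes:
    reputation: int
    up: int    # number of up votes cast by the user
    down: int  # number of down votes cast by the user


def qualifies(user):
    """Filter from the caption: >= 1 up vote, >= 1 down vote, reputation >= 100."""
    return user.up >= 1 and user.down >= 1 and user.reputation >= MIN_REPUTATION


def tier_of(reputation):
    """Return the label of the highest tier whose lower bound is met."""
    label = None
    for lower, name in TIERS:
        if reputation >= lower:
            label = name
    return label


def up_vote_percentage_by_tier(users):
    """Pool votes per tier and return % = 100 * up / (up + down)."""
    totals = {}
    for user in filter(qualifies, users):
        tier = tier_of(user.reputation)
        up, down = totals.get(tier, (0, 0))
        totals[tier] = (up + user.up, down + user.down)
    return {tier: 100.0 * up / (up + down) for tier, (up, down) in totals.items()}
```

Because every qualifying user has at least one up vote and one down vote, the denominator can never be zero.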
It is very clear that users with higher reputations are more likely to down vote. But this led to other questions, such as:
- Do the users with older accounts, especially beta users, make up the negative voting club?
- Did the down votes shift downwards because of new features introduced to Stack Overflow, specifically new voting options like 'Spam', 'Offensive', 'Inform Moderator' and 'Close'?

Notes on the above graph:
- The percentage is only for users with
  - at least one up vote
  - at least one down vote
  - reputation of at least 100
- The yellow data point on the Average Reputation series represents the day Stack Overflow sign-ups were open to the general public. Our own little Eternal September, if you will. (Not that bad, actually.)
- The purple spike in the %-Up Votes is caused by Neil Butterworth, who has both many votes (~1700 in the data dump) and a 50/50 Up Vote versus Down Vote ratio.
- The leveling of the average reputation curve a few weeks after Stack Overflow went public (end of September/early October) is interesting. It seems, to no surprise, that beta users and the initial public users are much more into SO than the follow-up users.
- The far left data point represents seven users who got accounts on 31 July 2008. They include the movers-n-shakers of Stack Overflow (Jeff Atwood, Jarrod Dixon, Joel Spolsky, and Jon Galloway), who understandably have very high reputation scores.

It seems that the new voting options do not impact up/down voting patterns significantly. Note the sudden growth of 'close' votes in the trailing week or two of the graph. It seems to me that this is a change in the raw data rather than a sudden burst of close votes, but I am not sure, because I myself did not gain the power to vote for closing a question until around that time. Also, 'close' votes are only valid for questions and not answers, unlike up and down votes.
The last few weeks of the data dump look interesting, so I zoomed in there and produced the graph below.

The burst and subsequent tapering off of Spam, Offensive and Inform Moderator votes seems very suspicious to me. Was this actual activity? Or was there a data collection issue? Or was that when these voting features were created? I'm guessing it was a data collection issue. Future data dumps will show this to be true or not, I hope.
Whatever the cause, the number of votes here is still too small to impact the percentage of up votes over time to any degree. My conclusion is that
- SOpedians with high reputations are more likely to vote down questions and answers
- As Stack Overflow gains more and more users with lower reputations, who are less likely to vote down, the overall percentage of up votes among all votes rises over time.
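
The second conclusion is essentially a weighted-average effect, which a rough sketch can make concrete. The numbers below are purely illustrative and are not taken from the data dump; `overall_up_percentage` and the two vote groups are hypothetical names made up for this example.

```python
def overall_up_percentage(groups):
    """groups: list of (total_votes, up_fraction) pairs; returns the pooled up-vote %."""
    total_votes = sum(votes for votes, _ in groups)
    up_votes = sum(votes * up_fraction for votes, up_fraction in groups)
    return 100.0 * up_votes / total_votes


# Illustrative figures only (assumptions, not measured values):
# high-reputation users cast 10,000 votes, 80% of them up votes.
HIGH_REP = (10_000, 0.80)

# Low-reputation users vote up 95% of the time; watch what happens as
# their share of all votes grows.
for low_rep_votes in (10_000, 50_000, 200_000):
    low_rep = (low_rep_votes, 0.95)
    print(low_rep_votes, round(overall_up_percentage([HIGH_REP, low_rep]), 1))
```

With these made-up inputs the pooled percentage climbs from 87.5 towards the low-reputation group's 95 as that group casts a larger share of the votes, even though neither group changes its own behaviour.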