I wouldn't get my hackles up about this helping people shit in the well. I failed to finish the article because he has no clue what he is talking about. For example, Ask HN does more poorly in part because the system actively penalizes Asks compared to urls, something he is clearly not aware of.
I sometimes post consistently. At one point, I was pretty good at getting stuff onto the front page. I paid no attention to posting time, number of words in the title, etc.
I paid attention to good content and good headlines. A so-so headline with excellent content fares better than a great headline and so-so content.
Periods where I post crappy articles that no one cares about seem to hurt my numbers long term. It is better to post good stuff less often. It seems to take time to recover after a period of posting too many Meh articles before people will bother to check out my submissions.
I see no reason to infer that Sunday is a better day to post some piece of crap you wish would go viral. Perhaps articles submitted on Sunday are generally better quality because it is a day off for most people, so perhaps they have more time to read, conclude something is HN worthy, and take the time to post it.
The metrics used do not strike me as especially useful or meaningful. I did experiment for a time with deleting my own blog posts and renaming them. A better title helps get initial traffic. I stopped doing that because I suspect too many people notice, remember and pass on clicking on my domain name because of it.
I have had people tell me they pay no attention to member names. This really does not fit with my subjective experience. They may pay no conscious attention, but the info is there and I strongly suspect it has an effect.
I mean, I can't prove any of that. But correlation does not prove causation and the author has no deep knowledge of the forum and the community here. While it may be helpful to, say, keep titles shorter, I think most of this data isn't particularly useful.
yes, what the world needs now is more and more people who care about marketing their brand. this might as well be called "a guide to shitting in the well: how to poison the HN community so you can go viral"
I think marketeers have already found out. Maybe I'm just being jaded and/or my interests have evolved, but it seems like more techno-nonsense has hit the front page in recent months than in years past.
It's up to the community to police the content. Domains are now clickable and have a submission history page just like users. It only takes one person to notice exploitative behavior for the community or moderators to intervene.
I think it's getting better though. It seems like plenty of startups can artificially hit the front page a few times, but once the voting ring filter has caught all their employees and friends it is much harder.
and as the eternal septembering continues, newbies upvote the bad content, whilst posting comments filled with bad reddit puns, SJW crap, and "this." as the entire content of their reply.
I agree. The dataset has been around for a while now. Also, even though you can increase your chances a few percent by hitting the right time, the users are still the ones to decide.
I'd give up all my karma and ability to upvote if it meant that half the HN users with downvoting ability had it taken away. Especially the ones that downvote before replying.
There's one flaw in this type of analysis, and it is a behavioural effect that can't be measured.
Let me put it this way:
If this post of "how to be successful on HN" hits critical mass and gets 400 votes, lingers for an entire 24 hours on the front page and gets loads of clicks (and it contains definitive evidence of "when", "what", "how" to post), it will lead to the opposite happening...
How?
Well, hundreds of folks will try it out, significantly increasing the submission rate on "new" during the "best days/times to link". That makes it even harder for anybody to upvote your article, because it will spend a lot less time than normal on the first page of "new".
The real secret to getting popular on HN is to form voting-rings and upvote your buddies' posts ("you scratch my back, I scratch yours").
Alternatively, you can just buddy up with the people who run HN and get them to manipulate your posting by allowing multiple levels of rigging (I've read somewhere that they "allow good posts that were missed to be brought back to life", which makes you wonder what the definition of a "good post" is).
It would be interesting to see who the most successful submitters are, that is, the users with the highest ratio of submitted posts that reached the front page to overall submitted posts.
-- perc: share of each author's submissions that scored at least 10 points
SELECT author, COUNT(author) AS num_submissions,
  ROUND(SUM(score >= 10)/COUNT(author), 3) AS perc
FROM [fh-bigquery:hackernews.stories]
GROUP BY author
HAVING num_submissions >= 10
ORDER BY perc DESC
LIMIT 100
I looked for that in the article too and didn't find it. But then I thought it might not be that easy. I suspect there will be a lot of users with a lucky shot or two, ruining the statistics.
If I remember correctly, the leaders list (https://news.ycombinator.com/leaders) used to have a third column simply named Avg. It showed karma per post.
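One way to keep one-hit users from dominating such a ranking is to require a minimum number of submissions, as the HAVING clause in the query above does. Here is a rough Python sketch of the same idea. The function name, the (author, score) input shape, and the use of "score >= 10" as a stand-in for "reached the front page" are all assumptions for illustration, not something the dataset defines.

```python
from collections import defaultdict


def front_page_ratio(stories, min_submissions=10, score_cutoff=10):
    """Rank authors by the share of their submissions at or above score_cutoff.

    stories: iterable of (author, score) pairs (hypothetical input shape).
    Authors with fewer than min_submissions posts are dropped, so a single
    lucky hit cannot produce a perfect ratio.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for author, score in stories:
        totals[author] += 1
        if score >= score_cutoff:  # assumed proxy for "made the front page"
            hits[author] += 1

    ranked = [
        (author, n, round(hits[author] / n, 3))
        for author, n in totals.items()
        if n >= min_submissions
    ]
    # Highest ratio first.
    ranked.sort(key=lambda row: row[2], reverse=True)
    return ranked
```

With min_submissions=1 a user with one submission and one hit tops the list at 1.0; raising the threshold removes them, which is the whole point of the filter.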
I did an analysis of titles a year ago (http://minimaxir.com/2014/02/hacking-hacker-news/). In this article, a few formatting choices make the charts very hard to read. Ordering weekdays from Monday-Sunday instead of Sunday-Saturday hides the boundary conditions. Analyzing topics by # words instead of # characters may not give the best results since HN titles tend to be idiosyncratic, etc.
From my experience and analysis a year later, HN is much, much more luck-based than Reddit, but the new repost rules alleviate that.
I can't tell for sure, but it doesn't seem like this analysis normalizes for total submissions in each category. For example, if there are very few Ask HN submissions, but all of them make the top 3%, then his charts would say that Ask HN is not very popular.
What you really want to know is whether posts with a particular attribute reach the top out of proportion with their representation in the total population.
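A minimal sketch of that comparison, assuming the data is available as (category, score) pairs: for each category, divide its share of the top posts by its share of all posts. The function name, the input shape, and the 3% cutoff as a parameter are my own assumptions for illustration; the article's actual method may differ.

```python
from collections import Counter


def representation_lift(posts, top_fraction=0.03):
    """Return {category: lift}, where lift = share of top posts / share of all posts.

    posts: list of (category, score) pairs (hypothetical input shape).
    A lift above 1 means the category is over-represented among top posts;
    below 1 means it is under-represented.
    """
    ranked = sorted(posts, key=lambda p: p[1], reverse=True)
    top_n = max(1, round(len(ranked) * top_fraction))
    top = ranked[:top_n]

    all_counts = Counter(category for category, _ in posts)
    top_counts = Counter(category for category, _ in top)

    lift = {}
    for category, n in all_counts.items():
        share_all = n / len(posts)
        share_top = top_counts[category] / top_n
        lift[category] = share_top / share_all
    return lift
```

In the Ask HN example above, a category with very few posts that all land in the top slice gets a large lift even though its absolute count among top posts is small, which is exactly what a raw-count chart would hide.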
I think "time on the front page" would be a better measure of popularity.
Even better, "total site-wide HN comments made while the article was on the front page" as a proxy for the number of users being online and thus having seen it.
Unfortunately, a) that's impossible to see in historical data unless you had a huge number of snapshots, and b) the algorithm changed very recently, so older data wouldn't help.
The first and second graphs don't seem to match the surrounding text. The Y axis is absolute count instead of popularity %. Apart from that, great article.
> So all in all, statistically, you can maximize your chances of getting a popular HN post by ...
It's the same in all groups. You get popular by posting content that conforms to the group-think and avoiding content that doesn't. This leads to uniform, harmonious, but eventually uninteresting groups that sanction deviant behavior.