Goodbye Freebase, Hello WikiData!

Google has pledged to make Freebase read-only at the end of this month. The end of an era! The dawn of a bright new age! What’s Freebase? These are just some of the reactions I have had, heard, or made up in my head since the announcement.

If you are already familiar with Freebase, then this article is for you, even if, like me, you only really log in when you have a new client or have come across a really interesting thing that, for whatever reason, doesn’t pop up in Google’s Knowledge Graph.

If you need to know more about what Freebase is (was, kinda), then check out the Wikipedia entry before you dive in.

Using And Abusing Freebase

Now, Freebase was an unbelievably potent weapon in the SEO arsenal, and if you used it right you were making a real contribution, however slight, to the sum of human knowledge and understanding. It’s rare to see such a satisfying combination – often the most immediately effective SEO techniques are distinctly unsavoury, so an immediately, often dramatically effective White Hat technique is a welcome step towards redressing the balance.

Referring to it as a ‘weapon’, though, is already treading on what I consider to be unethical ground. The purpose of Freebase is clearly to catalogue as much of the world as possible – using it with the sole intention of pushing an agenda is dubious, and far too easy. I believe that anyone could easily use Freebase to manipulate the SERPs and Knowledge Graph if they wanted to.

The only things stopping us are Freebase’s unfriendly UI, outdated documentation and the pleasant and engaged but massively over-stretched community. I’m not complaining here – abusing Freebase for anything other than your own childish amusement is pretty questionable in my book, and I suspect that using it for subtle negative SEO attacks could be possible. Given the relatively lax notability guidelines of Freebase, and the potential for mischief, the inaccessibility of Freebase is a feature, not a bug!

So, currently I believe that Freebase is useful because of an easy-going, small community and how irritating it is to actually use Freebase. I simultaneously believe that Freebase is not useful and even potentially harmful because of an easy-going, small community and how irritating it is to actually use Freebase.

Why The Change?

WikiData has a familiar format for readers, a stable third party supporting it, and a slick UI. Its name communicates clearly what it is and (roughly) what its intentions are. It can draw on information from other Wiki-x sources easily. It’s generally more accessible.

Freebase has extremely lax notability guidelines, while Wikipedia has relatively tight notability guidelines. It’s unlikely that a majority of regional or small businesses will qualify for their own Wikipedia entry, but a Freebase entry may well be useful. WikiData falls somewhere in between these two extremes.

The WikiData notability guidelines are effectively anything on a Wiki site, or anything that is an identifiable entity that can be described using serious public references, or anything that contributes to the structural integrity of the data.

As a completely fictitious example, a particular obscure poodle cross might be a necessary entry on a list of poodle crosses, but fail all other tests. Freebase essentially asks “Does it exist? Are you lying? Welp, that seems fine, on you go.” Wikipedia asks “Are there serious respectable sources? No? It’s just a poodle cross? Not notable, rejected!” WikiData, in contrast to both, might ask if the list of poodle crosses requires at least some non-notable examples in order to make sense, before accepting the particular poodle cross example.
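For the programmers in the audience, here is a rough sketch of how I read those three sets of rules, written as Python predicates. To be clear, this is my own simplification and not anything the sites publish: the `Candidate` fields and the function names are invented for illustration, and the real guidelines involve human judgement that a boolean can't capture.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical facts about a proposed entry (field names are illustrative)."""
    exists: bool                # it is a real, identifiable thing
    has_serious_sources: bool   # serious public references describe it
    on_a_wiki_site: bool        # it already has a page on a Wiki site
    needed_for_structure: bool  # another entry or list needs it to make sense


def freebase_accepts(c: Candidate) -> bool:
    # "Does it exist? Are you lying? Welp, that seems fine, on you go."
    return c.exists


def wikipedia_accepts(c: Candidate) -> bool:
    # Serious, respectable sources are required.
    return c.exists and c.has_serious_sources


def wikidata_accepts(c: Candidate) -> bool:
    # Any one of the three criteria described above is enough.
    return (
        c.on_a_wiki_site
        or (c.exists and c.has_serious_sources)
        or c.needed_for_structure
    )


# The obscure poodle cross: real, unsourced, but needed to complete a list.
poodle_cross = Candidate(
    exists=True,
    has_serious_sources=False,
    on_a_wiki_site=False,
    needed_for_structure=True,
)

print(freebase_accepts(poodle_cross))   # True
print(wikipedia_accepts(poodle_cross))  # False
print(wikidata_accepts(poodle_cross))   # True
```

The poodle cross gets into WikiData only through the structural criterion, which is exactly the gap between the other two platforms.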

In this way, we’ve got a promising platform that’s potentially much easier for people other than academics, programmers and SEOs to use, and much easier for everyone to contribute to. This is a really exciting and promising step!

Freebase As DMoz MK II

DMoz functioned similarly to Freebase – in fact, the parallels are pretty striking. A powerful effect on SEO, a relatively small community of custodians and worker bees keeping it in shape, not much use to the average reader but a great source of information for programs. DMoz may be less useful for SEO now, but the rest of the comparisons hold.

I suspect that it’s these similarities that made the move away from Freebase necessary. DMoz has been extremely backlogged for years, prompting allegations of corruption and conflict of interest from some sections of the digital marketing community. I’m fairly confident that Google has no interest in encouraging further theories about its various inner machinations, and so a reputable, independent, well-maintained new platform for data would be perfect for them.

The Risks And Strengths Of WikiData

Both a risk and a strength of WikiData is that it is governed by consensus. This makes it flexible (so it is not vulnerable to abuse of loopholes) but also potentially inconsistent. We may end up with a situation where legacy data remains that does not line up with contemporary notability rules, or is even recorded in the incorrect way.

A significant strength, and huge improvement over Freebase, is that there is extensive and up-to-date information on how WikiData works on the site itself. This suggests that there will be a smaller stream of mistakes to correct from new users, inexperienced users, and experienced users who just happened to forget the right way to link two items.

A second strength is WikiData’s user community, which has already achieved great things. Check out, for instance, this map of intellectual influence in the ancient world – a genuinely interesting data visualisation that teaches us something new, and something that can’t be learned by looking at individual nodes in a traditional model like Wikipedia’s. While Freebase could be used in similar ways, projects based on it seem to be harder to find and more private.

The biggest risk of WikiData is that it might reject a straight import from Freebase. This would be a massive blow, as from June 30th Freebase will no longer be accessible even for reading, and much of its knowledge will be lost.

How Should We Approach WikiData as SEOs?

If your client is neither notable nor a useful addition to WikiData, then it might be the case that your Freebase data is already excluded from Google’s Knowledge Vault as having a low confidence rating. Adding it to WikiData will probably just result in an eventual removal.

If your client is notable by WikiData’s standards, and is already on Wikipedia, you probably don’t need to do anything – your data will probably already be on WikiData, which you can check by looking up, say, I don’t know, SEOMoz, whose company page on WikiData has been entirely created by bots. If you are already on Wikipedia it seems unlikely that you need more weight behind your Knowledge Graph entry in any case.
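If you would rather check from a script than by hand, something like the following should work. It is only a sketch: it builds a query URL for the `wbsearchentities` action of the WikiData API and pulls the item IDs out of the JSON that comes back. The helper names `build_search_url` and `matching_items` are my own, and the sample response is hand-written to show the shape I expect, not real output.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(name: str, language: str = "en") -> str:
    """Return a URL that searches WikiData items by label or alias."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def matching_items(response: dict) -> list[tuple[str, str]]:
    """Extract (item id, label) pairs from a parsed wbsearchentities response."""
    return [(hit["id"], hit.get("label", "")) for hit in response.get("search", [])]


url = build_search_url("SEOMoz")
print(url)

# Hand-written stand-in for a parsed JSON response (the ID is a placeholder).
sample_response = {"search": [{"id": "Q00000", "label": "Example Co"}]}
print(matching_items(sample_response))
```

Paste the printed URL into a browser, or fetch it with whatever HTTP client you prefer. An empty `search` list suggests there is no item under that name yet, though it is worth trying alternative spellings before concluding that.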

So, if your client is notable by WikiData’s standards and is not already on Wikipedia, you will want to act. If, by the time this post is live, Freebase is still accepting writes, record your company’s important details there while you still can. Then, if you haven’t already, make sure that your business’ details on Google My Business are up to date, and record them on reputable and descriptive directories. These are more important for the Knowledge Graph, in my personal experience, than Freebase ever was.

Finally, create a new item on wikidata.org, give it a label that is the shortest accurate name for your client, and a description that is the shortest accurate description that also distinguishes your client from other things with the same label. Then, add statements – you will need at least a statement that claims your client is an instance of something, probably a company, and a statement that registers your client’s official website. From there, it should be fairly obvious how to proceed if you’ve used Freebase before.
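The web form is the easiest route, but it can help to see what such a minimal item looks like as data. The sketch below builds the JSON that, as far as I understand it, the `data` parameter of the API's `wbeditentity` action expects when creating a new item. It does not send anything: a real request also needs a logged-in session and an edit token, which I have left out. `P31` ("instance of") and `P856` ("official website") are real property IDs; the numeric ID I use for "company", the client name and the website are assumptions you should check or replace.

```python
import json

INSTANCE_OF = "P31"        # WikiData property "instance of"
OFFICIAL_WEBSITE = "P856"  # WikiData property "official website"
COMPANY_ITEM_ID = 783794   # assumed to be the "company" item (Q783794); verify on the site


def item_statement(prop: str, numeric_id: int) -> dict:
    """A statement whose value is another WikiData item."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": prop,
            "datavalue": {
                "value": {"entity-type": "item", "numeric-id": numeric_id},
                "type": "wikibase-entityid",
            },
        },
        "type": "statement",
        "rank": "normal",
    }


def url_statement(prop: str, url: str) -> dict:
    """A statement whose value is a URL (stored as a string value)."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": prop,
            "datavalue": {"value": url, "type": "string"},
        },
        "type": "statement",
        "rank": "normal",
    }


def build_new_item(label: str, description: str, website: str, language: str = "en") -> dict:
    """Assemble the minimal item described above: label, description, two statements."""
    return {
        "labels": {language: {"language": language, "value": label}},
        "descriptions": {language: {"language": language, "value": description}},
        "claims": [
            item_statement(INSTANCE_OF, COMPANY_ITEM_ID),
            url_statement(OFFICIAL_WEBSITE, website),
        ],
    }


# Hypothetical client details, for illustration only.
payload = build_new_item(
    label="Example Widgets Ltd",
    description="widget manufacturer based in Leeds",
    website="https://www.example.com/",
)
print(json.dumps(payload, indent=2))
```

Note how short the label and description are: the label is just the name, and the description says only enough to tell this item apart from anything else called the same thing.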

Again, it’s important not to add items unless you genuinely believe they contain helpful, accurate and valid information.

Sources For The Knowledge Graph

As far as I’m aware, the main sources for the Knowledge Graph that may be relevant to a site’s SEO and informative knowledge cards are, in ascending order of importance, respectable directories, Freebase/WikiData, Google My Business, and Wikipedia.

Alternative sources are likely to be of limited use to an SEO, but are known to include the CIA World Factbook.

As long as we respect these sources for the Knowledge Graph, we can help our clients generate more relevant traffic as well as add to the world’s sum of knowledge. If the SEO community attempts to abuse these resources, we can expect a corresponding devaluation of our efforts and of the efforts of all the other contributors to these projects.