Whitehouse.gov Keeping Searchers From Asking Questions (Inadvertently)
Yesterday on the O’Reilly Radar blog, I posted that a key component of the American government’s initiative to be transparent and open with its data is to ensure content from government web sites is available through the major search engines. Tomorrow, I’ll follow that up with a post that details some of the specific obstacles that are blocking them now and how they might become more search engine-friendly.
While I was working on that post today, I checked out the new section of whitehouse.gov called “Open for Questions”. Since this is a brand new feature and has been talked up quite a bit online, I thought it would be a good case study of how government efforts fare in search.
The trouble with Google Moderator
Open for Questions uses Google Moderator, which, as I’ve previously posted, is a black hole for search engines. That’s unfortunate, since people who are searching for information on health care reform, financial stability, or the environment would likely benefit from finding a lively discussion on whitehouse.gov with concerns from fellow citizens. Another of Google Moderator’s drawbacks is that it doesn’t provide a unique URL for each topic. So, for instance, if I wanted to email a link to the discussion about retirement security to my mom or post a link to the education discussion on Facebook, I couldn’t. The best I could do is tell people to go to http://www.whitehouse.gov/OpenForQuestions/ and then scroll to the topic and click on it. If I wanted to share a specific question from one of those topics? Not a chance.
The semantic structure of the page
I then noticed that the title tag of the page uses the text “Open For Questions” with no additional clue that this page is part of whitehouse.gov or that these questions are intended to be from citizens to the White House. That’s not great for either search ranking or search acquisition. If searchers see a listing in the search results for “Open For Questions”, it could mean anything. (Much of whitehouse.gov has title tag issues.)
The heading on the page is “Your Questions on the Economy”. That’s kind of confusing, since you can use the topic navigation to ask questions about all kinds of things. And the heading is in a span with a class, not an H1, so it doesn’t signal clearly to search engines that this is what the page is about. The text mentions “Thursday”, but no date, which makes me wonder if these discussions will no longer be online after tomorrow. If they do stay online, having the date on the page from the start would make the page a lot easier to maintain later.
Of course, since all of the text is pulled in from Google Moderator via JavaScript, the search engines (and possibly those on mobile devices, screen readers, and older browsers) can’t access any of it anyway. Instead, those users see this: “Google Moderator is a tool that allows distributed communities to submit and vote on questions for talks, presentations, and events. You must have JavaScript enabled in order to use this feature.”
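The checks in this section (the title tag, whether there’s an H1, and what text exists outside of JavaScript) are easy to script. Here’s a rough sketch using Python’s standard library. The `PageAudit` class, the `audit_page` helper, and the sample markup are all made up for illustration; the sample is a stripped-down stand-in, not the actual whitehouse.gov source.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Collects the title, the number of H1s, and text that exists without JavaScript."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1_count = 0
        self.static_text = []
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth and data.strip():
            self.static_text.append(data.strip())


def audit_page(html):
    """Parse an HTML string and return the filled-in audit object."""
    audit = PageAudit()
    audit.feed(html)
    audit.close()
    return audit


# Hypothetical markup that mimics the issues described above.
sample_page = """<html><head><title>Open For Questions</title></head>
<body><span class="heading">Your Questions on the Economy</span>
<script>loadModerator();</script></body></html>"""

audit = audit_page(sample_page)
print("Title:", audit.title)
print("H1 count:", audit.h1_count)
print("Static text:", audit.static_text)
```

A page with a vague title, no H1, and almost no static text gives search engines very little to work with, which is the situation described here.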
The real problem: indexability
But as it turns out, all of those obstacles are inconsequential. A search for [white house open for questions] brings up lots of pages that reference the feature, but not the http://www.whitehouse.gov/OpenForQuestions/ page itself.
I was initially surprised because even though much of the content on the page is inaccessible to search engines, the page has lots of links with descriptive anchor text. And then I realized this isn’t a ranking problem, it’s an indexing problem.
Surely the search engines have seen the page by now. I checked robots.txt but didn’t see anything blocking it. And then I looked at the source code. There it was.
<meta content="noindex,nofollow" name="robots"/>
They didn’t follow the first rule of indexing.
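If you’d rather not eyeball the source code, this check can be automated. The sketch below uses Python’s built-in HTML parser to pull the directives out of any robots meta tag. The `RobotsMetaParser` class and `robots_directives` function are names I made up for this example, and the sample string is just the tag above wrapped in a minimal page.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


sample = '<html><head><meta content="noindex,nofollow" name="robots"/></head></html>'
directives = robots_directives(sample)
if "noindex" in directives:
    print("This page asks search engines not to index it.")
```

Running something like this against every new page before launch is a cheap way to catch a leftover development-phase block.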
My guess is that blocking search engines is either accidental (which happens all the time when pages are first launched, since they are often blocked during the development phase) or this page is intended to be only temporary, meant to gather questions for a short period of time, but not keep them as an archive later. I hope the latter isn’t the case, because I think having a repository of the questions Americans care about most can be very useful for a number of reasons.
How can this be fixed?
It may be difficult to fix the issues with indexing Google Moderator content (and that’s really something I wish Google itself would address), but it should be fairly simple to expose the “Open for Questions” landing page to search engines. Just remove the meta robots tag, make the title a bit more descriptive (“Open For Questions: Ask the White House About Your Concerns”), and add a paragraph of text that’s outside of JavaScript that describes what the feature is about. And in the short term, perhaps the content could be exported to an HTML file once questions are closed so they could be searched over later.
Check O’Reilly Radar tomorrow for more tips on how the government can make its content more findable.






Amazing… GoogleMod is not SEO.
They could have gone with Yoosk, but it’s not made in the USA.
Vanessa, I love your ability to dig and discover. A couple days ago I was at the White House site and decided to see how well it’s optimized in general. It turns out it’s not. By a long shot. A friend said “well they’ve probably got enough back-links”… Uh, how many times can it be emphasized that back-links alone do not a quality SEO plan make?
I did just a random check to see what phrases people might be using to find info that clearly should result in the White House site coming up. Phrases that students might use, for example.
American President – they come up fifth
leader of America – not found
Later on I tried a couple dozen phrases and many had the same results.
Like for the term
American Vice President – one entry in the 9th position – and it’s a press release from February: a 14,000-word transcript (yes, I checked the word count) of remarks the President and Vice President made at a museum in Denver. OMG
The main site’s optimization is awful. Like for the main landing page on the current administration’s agenda, the title is “The Agenda”. Seriously! And the Fiscal Agenda page – the title is “Fiscal”.
The Meta keywords field is filled with single words, overstuffed, and the Meta description on every page is the same (and it’s 267 characters long).
I know that there’s a really good chance that people will type a longer phrase, which will probably include “White House” or “Obama”. And that someone who gets to the site could then eventually find their way around.
Yet I can’t help but think about the fact that they’re missing so much of an opportunity. It’s just sad.
As much of a challenge as it would be to optimize the newest features and areas of the site, at least they could hire an intern to work on the basics couldn’t they?
Vanessa, fantastic investigation!
Why did you write “Another of Google Moderator’s drawbacks is that it doesn’t provide a unique URL for each topic”? I can see a link to a moderator.whitehouse.gov URL for each topic — check Education for instance. Isn’t this a permanent link? I agree that a link to each question would be best though (and preferably without noindex, nofollow meta)
Anyway, I am happily surprised that one doesn’t have to sign in with a google account before participating!
Keonda,
Thanks!
There’s a couple of things going on with the URLs that aren’t ideal. The first is that if you go to whitehouse.gov/openforquestions and click on any of the topics to the left, the URL doesn’t change. That’s how most people will navigate.
If you break out of the frame, as you did, you can get unique URLs (which at least will help with sharing), but those still can’t be indexed because the unique portion of the URL comes after a #. Since those normally signify anchors within a page, search engines don’t index them. They truncate the URL to end just before the #.
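To see what that truncation looks like in practice, here’s a small sketch using Python’s `urllib.parse.urldefrag`, which splits a URL at the #. The two topic URLs are hypothetical (I’m using a made-up moderator.example.gov host and made-up fragment values, not the real Moderator URL format), and `url_as_indexed` is just an illustrative helper name.

```python
from urllib.parse import urldefrag


def url_as_indexed(url):
    """Return the URL with everything from the # onward dropped."""
    return urldefrag(url).url


# Hypothetical topic URLs that differ only after the #.
education = "http://moderator.example.gov/#topic=education"
retirement = "http://moderator.example.gov/#topic=retirement-security"

print(url_as_indexed(education))
print(url_as_indexed(retirement))
print(url_as_indexed(education) == url_as_indexed(retirement))
```

Both topics collapse to the same address, so as far as a search engine is concerned there is only one page to index.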
I don’t think whitehouse.gov cares or even needs SEO. I wouldn’t doubt if they were deliberately snubbing it. After all it’s whitehouse.gov. There’s a lot of smart people over there and I would venture to guess they know the basics. They also have an on-site search engine, not sure where it comes from. Definitely not Google but it works. If I need something from the White House and I want to be sure it’s from the White House I can go there and get it. As for the Town Hall thing, do you really want to stuff commercial search engines with 100,000 mostly redundant questions? I go to search engines to get answers. Anyway I’m guessing this is probably some kind of temporary page so perhaps they don’t want a bunch of 404s if they deactivate it for later use under a different topic. They are actually doing search engines a favor. I for one like the site. I’m sure Google will archive that stuff somewhere since it was their API but probably not in the regular SERPs.
[...] not quite so cool is that Moderator apparently doesn’t play well with the rest of the web. I’m not sure why it was designed this way (and if I did know, I probably couldn’t tell [...]
[...] Google tools are to developers. Sure, the AppEngine servers running Google Moderator held up, but none of the discussion could be crawled or indexed by search engines. Maybe this didn’t matter to the White House (although I would think that some American [...]