Google News says we're included but our stories don't appear

pernod
Level 1
9/29/09
We're imarketnews.com.

We have a sitemap-news.xml; Google Webmaster Tools seems to like it and has recognized it as a news-format sitemap.  It's updated frequently with stories from the last four days (re-checked every minute, I think).  The URLs are fine: each has a number of 3+ digits that doesn't look like a year.  Only one domain is involved.  There is HTML navigation to the relevant pages.  The stories are in HTML format (not JavaScript-embedded or anything), though admittedly we need to tweak the teaser format a bit (working on it).  We have at least 25 stories a day, often many more.
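For anyone checking the same URL rule on their own site, here is a minimal Python sketch of the "3+ digits, doesn't look like a year" test described above. The function name, the 1900-2099 year range, and the example.com URLs are assumptions made for illustration; this is not Google's actual check.

```python
import re
from urllib.parse import urlparse


def has_unique_article_number(url: str) -> bool:
    """Return True if the URL path has a run of 3+ digits that is not a bare year.

    Hypothetical helper: the 1900-2099 "looks like a year" range is an assumption.
    """
    path = urlparse(url).path
    for digits in re.findall(r"\d+", path):
        if len(digits) < 3:
            continue  # too short to count as an article number
        if len(digits) == 4 and 1900 <= int(digits) <= 2099:
            continue  # a bare four-digit number in this range reads like a year
        return True
    return False


if __name__ == "__main__":
    for candidate in (
        "http://example.com/node/12345",
        "http://example.com/2009/story",
        "http://example.com/news/42",
    ):
        print(candidate, has_unique_article_number(candidate))
```

Running it over the URLs listed in the news sitemap is a quick way to spot any that would fail the rule as stated here.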

No frames, English language, each article has a link to it, the main pages never change their own URLs, and the site requires no registration or subscription.  Google web search seems to find many, many pages just fine.  The robots.txt file has nothing exotic in it and does not specify any user-agent by name (so presumably what Google web search sees, Google News search sees too).  URLs are canonicalized to imarketnews.com URLs.
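The robots.txt assumption above can be checked locally. This is a minimal sketch using Python's standard `urllib.robotparser`; the helper name, the sample rules, and the example.com URLs are made up for illustration, and the parser's matching is only an approximation of how Google's crawlers actually read the file.

```python
from urllib.robotparser import RobotFileParser


def allowed_agents(robots_txt: str, url: str,
                   agents=("Googlebot", "Googlebot-News")) -> dict:
    """Report, per user-agent, whether the given robots.txt text allows the URL.

    Hypothetical helper; it parses the text locally and makes no network requests.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Sample rules that name no user-agent specifically (an assumed example).
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
"""

if __name__ == "__main__":
    print(allowed_agents(SAMPLE_ROBOTS, "http://example.com/node/12345"))
    print(allowed_agents(SAMPLE_ROBOTS, "http://example.com/admin/settings"))
```

If the two agents ever get different answers for an article URL, that difference is worth investigating before looking anywhere else.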

However, Google News seems to have picked up a grand total of one of our articles, and no others; a site:imarketnews.com search shows as much.  Google Webmaster Tools doesn't report any crawl errors, shows a healthy crawl total, and isn't showing anything of consequence in the diagnostics.

We brought it to the News team's attention and got quoted back the standard boilerplate (a unique URL for each article's full text, a URL with a unique number, a fixed main page URL, and HTML links).  I understand that: they're busy, and they can't help everyone.  But it seems to us that the site complies with all those requirements, none of which are particularly out of the ordinary anyway; the stories just aren't showing up on the Google News site.  We've reviewed the technical requirements at http://www.google.com/support/news_pub/bin/topic.py?hl=en&topic=11665 and, as far as we can tell, complied with all of them too.

Any hints would be appreciated.

-p

Inbal
Google Employee
10/1/09
Hi Pernod,

Your article content appears to consist only of isolated sentences that are not grouped into paragraphs, so we won't be able to crawl it.
Try formatting your articles into text paragraphs of a few sentences each.

Hope this helps,
Inbal
pernod
Level 1
10/1/09
I will have this looked into.  I think I will have to have this done programmatically on the site, but I appreciate the hint and will let you know how it goes.
pernod
Level 1
11/4/09
I appreciated your response.  :)

We're working out how to get more stories news crawled.  I see in the documentation that under Google Webmaster Tools, Diagnostics, there can be a 'news crawl' errors section for news-specific problems.  I've been told that we've been approved for inclusion in Google News, but I do not have the news crawl option under my Diagnostics.  Is 'news crawl' or 'news crawl errors' under Diagnostics in Webmaster Tools no longer available?  If it is available, how can I get it enabled in my Webmaster Tools, so that I can see why specific stories aren't getting news crawled?  It would make it much easier to figure out these problems for ourselves.
Inbal
Google Employee
11/4/09
Hi Pernod,

In your Diagnostics, go to Crawl errors, and click the News tab on the right. You should be able to see the News-specific errors there.

Hope this helps!
Inbal
pernod
Level 1
11/4/09
Okay, thank you, there it is, much appreciated.  I sort of suspected that menu had been subsumed -- didn't much make sense to have each type of crawl error have a second order menu, so yeah.  My eyes still aren't finding the right places in the menus.  No news-specific errors, anyway, though, so the team's still working away at it.. :)
pernod
Level 1
11/4/09
Aha, actually the other version of the same site found a lot of things to look at!  Yaaay.  :)
pernod
Level 1
11/4/09
Okay, I'm seeing 'no sentences' errors for a lot of pages.  The content comes from a text feed that is usually 60-70 characters wide and often wants that formatting preserved (that is to say, the line breaks should appear where the source wants them), rather than following what we'd acknowledge is standard practice for HTML: letting the browsers, which are much smarter than all of us, do the appropriate wrapping themselves.

Sooo.. my suspicion is that our attempt to control the line breaks is what makes the news crawler unhappy.  Is there a more news-crawl-friendly way to present those breaks?  Or do we pretty much have to consider living without source-controlled line breaks?
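One way to do the regrouping programmatically, following the advice earlier in the thread to format articles into paragraphs of a few sentences each, is sketched below in Python. It assumes the feed separates paragraphs with blank lines; the function name and the `keep_breaks` option are hypothetical. Whether the news crawler accepts `<br>` breaks inside a paragraph is not answered in this thread, so that mode is only something to experiment with.

```python
import html
import re


def feed_text_to_html(text: str, keep_breaks: bool = False) -> str:
    """Turn hard-wrapped feed text into <p> paragraphs.

    Assumption: blank lines in the feed mark paragraph boundaries.
    With keep_breaks=True the source line breaks survive as <br> inside each <p>.
    """
    blocks = re.split(r"\n\s*\n", text.strip())
    joiner = "<br>\n" if keep_breaks else " "
    paragraphs = []
    for block in blocks:
        # Escape each non-empty line so feed text cannot inject markup.
        lines = [html.escape(line.strip()) for line in block.splitlines() if line.strip()]
        if lines:
            paragraphs.append("<p>" + joiner.join(lines) + "</p>")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    sample = (
        "Stocks rose on Tuesday as\n"
        "investors weighed new data.\n"
        "\n"
        "Bonds were little changed.\n"
    )
    print(feed_text_to_html(sample))
    print(feed_text_to_html(sample, keep_breaks=True))
```

The default mode gives up the source-controlled breaks entirely and lets the browser wrap; the `keep_breaks` mode keeps them while still giving each paragraph its own `<p>` element.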