We recently published a post on the official Google News Blog about what does
(and does not) improve a story's rank in Google News. If you haven't
seen it yet, read and enjoy:
Now we want to hear from you! Just a quick reminder that this is
primarily your group, a forum to foster your own discussions; that
said, I'll be around to help in the quest to separate fiction from
fact, myth from truth, and urban legend from... whatever isn't an
urban legend.
Additionally, I'm happy to announce a new guest poster in the group.
So if you see a surprisingly knowledgeable Google News Guide 3 called
Abe, don't worry, he really is legitimate :). And with that -- have
fun, post freely, and together we'll make news more universally
accessible.
Glad to see Google being more open about these topics.
A couple of thoughts . . . if a breaking news story doesn't have a lot
of information available yet (e.g. a celebrity dies, a plane crashes,
etc.), does a news site get penalized for updating that story once more
information is available? If you have hundreds of reporters . . . an
article could be updated several times in a 5-minute period with
useful information. Is this going to cause Google News issues?
Surely Google wants to have the most updated content and we as
publishers would like to provide it. Updating an existing story is a
very common journalism practice.
The Tribune Network has a ton of news photo galleries, some of which
have 'news in pictures' themes. A picture can often be the most
compelling portion of a story (e.g. hundreds of examples related to
weather or tragedy). A simple caption on a powerful photo
of the event could be much more interesting to the Google searcher
than a couple of paragraphs trying to explain the event. Can you
clarify where the line is drawn on this issue? How much text is
Google looking for on a page to get the page included in Google News?
Telling photographers to add longer captions would be well worth it if
it meant getting it included in Google News. The same is true for
video producers and the text associated with that video on the page.
You mention that drastically changing the layout of your site may
cause problems. This makes me think that perhaps the bot for Google
News is wildly different from Googlebot itself . . . but I won't get
into that. My concern is that the page you point us to
(http://www.google.com/support/news_pub/bin/topic.py?topic=11673)
doesn't have an option for 'My site changed design/layout.' If you
are telling us that design and layout could cause Google News problems,
then we should have an option that covers this when we attempt to
notify Google.
Again . . . I am pleased to see Google publicly releasing information
that is granular in nature. I hope this is a continued trend.
Great post! Can you talk about the importance of numerals in the URL?
It looks like there is a great advantage to having at least a
three-digit number in the URL. For a lot of sites (like those on
WordPress using SEF URLs) this is a very important point.
Thanks for the segue, vcore! Recently, there's been some U2U buzz
about the three digits required in order for articles to be crawled
for Google News. To catch up on this lively conversation, please
check it out here:
I'm pretty excited about the timeliness of this thread, because it's
exactly the sort of topic we want to take the opportunity to discuss
here. On that note, here are some points of clarification:
* First, let me confirm that three digits are a requirement in
order for your content to get crawled for Google News.
In the thread linked above, Richard's example qualifies
because it includes a date.
* It should also be noted, as Nengorama points out, that a URL with
four consecutive digits that look like a year would not
qualify unless, as in Richard's example, there are additional
numbers, such as a day or month (this should be good news
for Jeff, since it sounds like his URLs may qualify after all).
* The reason behind this requirement is that it helps the crawler
distinguish news content from other content on the site. Many
news stories already include numbers as a way to create a
distinct URL for each new article, while the URLs for most
non-news content (for example, Terms of Service or About
Us) do not include such a number.
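For publishers who want to pre-check their own URLs against the points
above, here is a minimal Python sketch. It is only one reading of the
rule as described in this thread, not Google's actual crawler logic. It
assumes that digits are counted across the URL path and query string,
and that a lone four-digit number between 1900 and 2099 is what "looks
like a year". The function name passes_three_digit_rule and the
example.com URLs are hypothetical.

```python
import re
from urllib.parse import urlparse

# Assumption: "looks like a year" means exactly four digits, 1900-2099.
YEAR_RE = re.compile(r"(19|20)\d{2}")


def passes_three_digit_rule(url: str) -> bool:
    """Rough check of the three-digit rule as described in this thread."""
    parsed = urlparse(url)
    # Assumption: only the path and query string count, not the hostname.
    text = parsed.path + "?" + parsed.query
    runs = re.findall(r"\d+", text)

    # Fewer than three digits in total: does not qualify.
    if sum(len(run) for run in runs) < 3:
        return False

    # A single number that looks like a year does not qualify on its own.
    if len(runs) == 1 and YEAR_RE.fullmatch(runs[0]):
        return False

    return True


if __name__ == "__main__":
    for candidate in (
        "http://example.com/2008/04/story.html",     # year plus month
        "http://example.com/2008/story.html",        # year only
        "http://example.com/news/story-48213.html",  # article id
        "http://example.com/about-us",               # no digits
    ):
        print(candidate, passes_three_digit_rule(candidate))
```

A check like this is only useful as a quick sanity test before launch;
the error reports for your own site remain the authoritative signal.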
Now, for Damone, Christina, and others, I do have some good news.
There's a way around this! When you create a News sitemap (and please
take note that this isn't the same as a Web sitemap), you're asked to
enter the URLs for all of your news stories. When submitting articles
this way, the three-digit requirement doesn't apply. You can keep your
"clean, human readable URLs" if that's what works for your site. For
more info about creating a News sitemap, we encourage you to browse
this section of our Publisher Help Center:
Plus, when you submit a News sitemap, you have the added bonus of
being able to associate meta-information with each article -- such as
publication dates or keywords for classification, which can certainly
help the success of your crawl.
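As a rough illustration of what such a file can look like, the Python
sketch below builds a small News sitemap with a publication date and
keywords attached to each article. The helper name build_news_sitemap,
the dictionary keys, and the example.com URL are hypothetical. The
element names (news:news, news:publication_date, news:keywords) and the
namespace are taken from the News sitemap schema as I understand it;
the schema has changed over time and may require further elements, so
please confirm the details in the Publisher Help Center before relying
on this.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumption: namespace of the News sitemap extension.
NEWS_NS = "http://www.google.com/schemas/sitemap-news/0.9"


def build_news_sitemap(articles):
    """Return a News sitemap as an XML string.

    `articles` is a list of dicts with the hypothetical keys
    "loc", "publication_date" and, optionally, "keywords".
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("news", NEWS_NS)

    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for article in articles:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = article["loc"]

        # Per-article meta-information lives in the news: namespace.
        news = ET.SubElement(url, f"{{{NEWS_NS}}}news")
        date = ET.SubElement(news, f"{{{NEWS_NS}}}publication_date")
        date.text = article["publication_date"]
        if article.get("keywords"):
            keywords = ET.SubElement(news, f"{{{NEWS_NS}}}keywords")
            keywords.text = ", ".join(article["keywords"])

    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_news_sitemap([
        {
            # A clean, human-readable URL with no digits at all.
            "loc": "http://example.com/politics/city-council-vote",
            "publication_date": "2008-04-10",
            "keywords": ["politics", "city council"],
        },
    ]))
```

Note that the article URL in the usage example contains no digits,
which is the case the sitemap route is meant to cover.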
Great idea. Now, please tell us what we need to improve our ranking.
The blog article tells us what the myths are but it doesn't even touch
on what it is that Google is looking for.
Do you prefer a city in the opening paragraph? An organization?
What?
I have added sitemaps, written all original material, followed every
rule Google has published only to see fewer and fewer articles get
posted or even linked to.
Another area of big concern for users on this forum... Why do "News"
sites end up in the "Blogs" area of Google when they are not blogs at
all? Why do some sites get called "Press Release" sites when, in fact,
they are not?
How does somebody get out of the blogs area and back into the news
area? How does somebody correct Google's classification of their
site? It appears that once Google's algorithms say your site is a
press release site or a blog, that is what you will be forever.
There are many issues that still need to be answered that I don't
see addressed. While the article clears up some misunderstandings (many that
have already been discussed here), it fails to address the concerns of
many in this forum. Will there be an update soon? Please!
I have a question regarding AP news. We put a lot of AP news on our
site. Should I be concerned about the duplicate content penalty when
it comes to wire service stories that many news sites use?
Why did you delete your reply to me from a couple of days ago? There
was some really good information in there that may have been helpful
to the group.
I was trying to delete a duplicate entry and must have unintentionally
deleted both copies -- thanks for the catch! For everyone else in the
group, here's the post Brent is referring to:
When we say that updating articles in Google News will cause a
"problem", we don't mean that we're penalizing you in any way. As we
mention in the blog post, Google News only crawls a URL once. If you
update an article after we've crawled it, those changes won't
currently be reflected in the snippet. Brent's right, we do want the
most updated content, and we're working on some changes that should
improve this coverage.
We totally agree that pictures are important (that's why we created an
image-only version of Google News!). In general, the best way to get
photos included in Google News is to attach them to articles. If
you're concerned that an article or caption is too short to be
included in Google News, you can head over to Webmaster Tools and
check out the error reports in your News crawl section:
If an article doesn't contain enough text to be included, Webmaster
Tools will let you know.
When we say that changing the layout of a site may cause inclusion
problems, we don't mean new font colors or backgrounds might throw off
the crawler. We're referring to technical changes that often occur
during a redesign. If your site undergoes a redesign, it's a good rule
of thumb to double-check the technical guidelines in our Help Center.
If you do have a significant drop in coverage that lasts for more than
a brief period, please let us know:
Good question, stepppo! Duplicate detection is merely meant to filter
out multiple copies of the same story and attribute these stories to
the original source. As long as you're publishing original news
content, including content aggregated from a wire service (such as AP)
shouldn't have any effect on the rank of your site's original stories.
Hello Marcela,
Glad to see Google coming through for the publishers. Many thanks.
A couple of doubts. These days Google News seems to favor US-based
sites no matter how frivolous they are, in ranking the news. This
trend seems true after the algorithm changed in Sept-Oct. This is not
a rant, but is anything going to be done about this?
One of the best things for me as a publisher as well as a consumer was
seeing a variety of sites hit the front-page (I am sure many here will
agree with me). Now just a handful of sites are featured...hence this
query.
Second, big publishers like NYT publish the same news with the same
headlines some hours after the original appeared on GN. If small
publishers were to put the same article in once again, will they be
punished?
Out of curiosity, what happened with the London torch protest news
that didn't make it to Google News the other day? Was it some
sort of technical fluke?
We're glad that you appreciate the diversity of sources we try to
foster on the Google News front page. That diversity is extremely
important to us, and we think it allows our users to gain a unique and
valuable perspective on the day's news.
We also recognize that regional editions are important to our users,
so we try to strike a good balance between showing articles from a
wide range of publishers, while still displaying some of the best
articles about stories our users care about the most. Often, that
means showing articles from publishers specific to a given region.
This is why, if you look at the US edition of Google News, you see a
lot of US sources; while the Canadian edition shows articles from
predominantly Canadian publishers.
We hope we do a good job of showing diverse, interesting, relevant
stories, but we know we can always do better. So keep giving us your
feedback, and let us know when we're getting the mix wrong!
Thanks for addressing my confusion/concern about the unique digits in
URLs
(http://groups.google.com/group/news-HelpPublishers/browse_thread/thread/3929d7437c6b5932).
While I'm still a bit confused about the policy, you provided an
excellent workaround: A News Sitemap. I will certainly use that in
place of worrying about my digits when I'm ready to submit content to
GNews.
One remaining point of confusion:
The way I read the FAQ, each web page must have a unique number not
repeated in other URLs. The LA Times example I gave had two numbers in
it: 2008 and 04. It would seem that unless they only published one
article in the entire month of April, such a URL wouldn't be
considered unique. However, you said it does qualify.
On a side note, I apologize if I came off a little crotchety in my
original post (which has since disappeared). I was a little shocked at
how "Ungoogle" the policy seemed. The Google I know doesn't require
site owners to publish in a particular fashion, especially when it has
the potential to stifle innovation.
Now that I know that the sitemap option (which I was planning on using
anyway) eliminates that requirement, I'm a happy camper again!
We're glad you're happy with Sitemaps, and we definitely encourage you
and other publishers to use them as extensively as possible. They're
by far the best way to make sure that Google News includes the
articles you want. And as we mentioned above, when we're not reviewing
stories submitted via Sitemaps, we need to use our three-digit rule to
help determine what is and isn't a news story.
In response to your question, Google News looks to see that each
article URL is unique, but the individual digits within two URLs can
be the same. For example, if you had articles at
newspublisher.com/2008/04/..., we'd be able to crawl all of them just
fine. But if they were at newspublisher.com/2008/..., we wouldn't be
able to include them.