5:31pm

iPhone-optimized IHT.com launches

While our American colleagues are enjoying this Fourth of July by gorging on delicious BBQ and tossing each other into swimming pools, here in France it’s just another day. Since we’re stuck in the office — on a wonderful Parisian summer day, no less — we decided to launch our iPhone-optimized version of IHT.com.

Point your iPhone or iPod Touch to: http://iphone.iht.com and take a look.

As you can see, this is an optimized version of the web site, and not an “iPhone application”, which is what they were called before Mr. Jobs decided to open up the iPhone for development. Now, they’re “native apps” which you obtain through iTunes and may or may not have to pay for. We don’t have any plans to develop a native app.

The iPhone site is still very much a work in progress. We wanted to get the core functionality in place first so the user can access nearly all of the content on the regular IHT.com site. You’ll notice that we’ve decided to build the entire site using AJAX (as opposed to published pages), and while this allows the speed to be quite zippy on EDGE and blazing fast on Wi-Fi, it also uncovered numerous bugs and usability issues.

The most glaring omissions are the lack of functionality of Safari’s back/forward buttons and bookmarking. The former is a very difficult nut to crack, as we’ve done extensive research and experimented with several JS libraries. While we were able to get browser history working, the JavaScript process required is extremely processor-intensive and caused a 15-25 second delay in the display of every new page requested. It was too detrimental to the experience to retain the functionality. JavaScript is notoriously slow on iPhones, and I hope that Apple addresses this some time in the future, unless of course this has been Apple’s plan all along: to cripple Safari knowing that companies would pay them real money to develop native applications that harness the full power of the phone. In any case, we felt that it’s less of a dealbreaker and more of an annoyance. This one is on the top of our bug list.

Bookmarking is somewhat related, and Google’s Really Simple History claims to address both this and the previously mentioned issues. We couldn’t get it to work properly, but we can think of a few ways to tweak our code so you can “drill” into the AJAX to access specific pages. Once we implement this properly we’ll also add an “E-Mail Article” feature, so keep an eye out for that.

Another minor annoyance is Safari’s address bar, which auto-hides on most sites but stopped working reliably on ours. In earlier builds of the site, it auto-hid as it should when the page finished loading, but somewhere along the line it stopped disappearing. We suspect it has something to do with AJAX processes running when the page thinks it’s finished.

There’s a short list of other issues that we’ll be addressing in the next few weeks, but if you find anything, feel free to report it here.

That’s it for now…we’re off to find an expat barbecue to crash.

3:09pm

Some answers to your questions

A few weeks ago we opened up the request lines to suggestions, and while the response was by no means overwhelming (I mean, who reads this blog anyway?), there were a few questions and concerns that I’d like to respond to:

Nikos writes:

Would it be possible for you to identify the source (IHT, New York Times…) of the stories and editorials you publish as you once did? It would make it a lot easier of those of us seeking original IHT content.

Sharp-eyed readers will notice that this is a feature we used to have. However, as The International Herald Tribune is officially “the global edition of The New York Times” (as you can see in our revised logo up top), it was decided that in order to achieve tighter editorial integration this differentiation was no longer necessary. At the moment there are no plans to reimplement the source line in this way.

Dave writes:

I wonder if there is a way to personalise RSS feeds? Something akin to Google news, where I would be notified of any article containing one or more key words. i subscribe to several of your feeds, but would relish the idea also being able to have an RSS feed that will pick out any article in any section that refers say to Ukraine, be it in culture, europe, business, etc.

This is a fantastic idea, and as an RSS junkie I can say that this one’s been on our dev team’s wish list for quite some time. After analyzing the concept in depth, we’ve discovered that the main roadblock is our content management system, specifically the way it handles keyword metadata. Since our articles come from a variety of sources (IHT, AP, Reuters, etc.), the quality of keyword data being stored in the system varies greatly. Inconsistent data formatting is a developer’s nightmare, and for that reason we can’t move forward until there’s a reliable, unified system. If implemented now, this feature’s reliability would be questionable at best.

The other option is to allow users to construct their own keyword-based feeds. While this is certainly possible, it requires a significant amount of server horsepower to handle thousands of simultaneous queries (replete with misspelled queries, etc.). It is beyond our servers’ capabilities at this time. Perhaps we’ll discover an elegant solution for this, but for now it will remain near the top of our wish list. Come to think of it, it would be nice if you could RSS-subscribe to a search result in Google Reader.
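To make the keyword problem concrete, here is a minimal Python sketch of a keyword-based feed filter. It is not how our CMS works; the article dictionaries, the `keywords` field, and the function names are all assumptions made up for illustration. The point is only that keywords from different sources need to be normalized before they can be matched reliably.

```python
def normalize(keyword):
    """Lower-case a keyword and collapse stray whitespace."""
    return " ".join(keyword.strip().lower().split())


def matches(article, wanted):
    """True if any wanted keyword appears in the article's keyword list."""
    have = {normalize(k) for k in article.get("keywords", [])}
    return any(normalize(w) in have for w in wanted)


def keyword_feed(articles, wanted):
    """Return the articles, from any section, that match the wanted keywords."""
    return [a for a in articles if matches(a, wanted)]


# Hypothetical articles whose keyword formatting differs by source.
ARTICLES = [
    {"title": "Gas talks resume", "section": "business", "keywords": [" Ukraine ", "Energy"]},
    {"title": "Kiev opera season", "section": "culture", "keywords": ["UKRAINE"]},
    {"title": "Paris in summer", "section": "travel", "keywords": ["France"]},
]

feed = keyword_feed(ARTICLES, ["ukraine"])
```

Even this toy version only handles case and whitespace differences. Misspellings, synonyms, and missing keywords are where the real difficulty lies.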

Juurd writes:

I love the audionews/readspeaker feature, listening to the IHT while driving my car is very practical. Unfortunately, since about a week or two the articles, or rather the podcasts of them, stopped streaming into my iTunes - while other podcasts, from The New York Times for example, still keep coming. I don’t know whether the problem is with my computer or my settings, or with the iht.audionews.

We apologize for recent outages regarding our feeds and AudioNews. Major server upgrades occurred over the past few weeks and unfortunately the communication between our server and AudioNews’ became unstable, resulting in few updates. Everything is back on track now, we hope, but in the coming days we’ll be looking for more of the inevitable quirks that arise with these kinds of upgrades.

12:00am

Resolving post scheduling issues in WordPress

We recently migrated iht.com to a new server farm and network, and in the process some funny things started to happen when our bloggers tried to schedule posts for future publication in WordPress. They complained that the scheduler stated the post would be published in x minutes, yet when the allotted time passed, guess what, nothing happened. We found several others out there who were having the same problem. Eventually, we were led to a very helpful post on the WordPress support forum that helped us solve the problem. Specifically, our particular problem was addressed by a contributor, liberalgeek. The problem that we were having is exactly as he/she states: our load-balanced setup does not allow requests to go out from a particular node and then return, hence wp-cron.php could not request itself properly. We easily fixed this problem by changing our hosts files to map the domain name to the internal IP address. Thanks, liberalgeek! The fact that you are reading this post is evidence that the issue is resolved.

We are running an older version of WordPress. If this is fixed in a later version of WordPress, and I hope it is, we did not find which version. Besides, upgrading WordPress on a site like ours is not a trivial task. Making a slight change to the hosts file is the kind of simple solution that the developer team at iht.com really likes.
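For anyone facing the same problem, the hosts-file change can be sketched roughly as follows. This is an illustration in Python rather than our actual configuration: the hostname `blogs.example.com` and the address `10.0.0.12` are placeholders, and the helper functions are hypothetical. The sample text shows the kind of entry involved, and the helpers check that a hostname is mapped to an internal (private) address.

```python
import ipaddress

# Hypothetical hosts-file content; the real hostname and IP are not shown here.
HOSTS_SAMPLE = """
127.0.0.1   localhost
10.0.0.12   blogs.example.com   # map the public name to the node's internal IP
"""


def parse_hosts(text):
    """Parse hosts-file text into a {hostname: ip} dictionary."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping[name] = ip
    return mapping


def resolves_internally(text, hostname):
    """True if the hosts text maps hostname to a private address."""
    ip = parse_hosts(text).get(hostname)
    return ip is not None and ipaddress.ip_address(ip).is_private
```

With an entry like this on each node, a request from wp-cron.php to the site's own domain name stays on the internal network instead of going out through the load balancer.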

3:00pm

Request lines are open

In our ongoing efforts to make improvements to iht.com, it’s sometimes easy to get trapped within our little internal bubble of design and development. Sure, we’re constantly redesigning sections and adding new features, but nearly all our ideas come from within. So what would our visitors like to see changed?

We like to think the Developer Blog provides a sort of direct line not only to our developers, but also to the design and user experience team. Since the blog began, and especially in our previous entry, we received comments and emails which provided us with some very useful feedback and suggestions, so now I would like to continue that by opening the floor with a more focused experiment.

We have some new gadgets and gizmos planned for the upcoming freshening of our article-level page (NOT blogs), and we’d love to pick your brain for more ideas. Is there anything you think we should kill, modify, or add? An interesting bit of functionality that would make the page even more useful? Let us know in the comments…

4:54pm

Who says type can’t be improved on the web?

When you spend your days in the company of people who’ve spent decades working on print publications, you’re bound to hear complaints about type on the web. XHTML and CSS don’t really account for the occasional orphans and widows (lone words on the first and last lines of a paragraph, respectively), odd ragging (whitespace at the right of each line) and rivers (whitespace which creates a crooked visual line vertically through a paragraph).

When a site automatically publishes thousands of articles a day through various data feeds, there just isn’t much you can do. On the web, designers have learned to let it go, and most developers just don’t care.

This week Typesites.com did a very flattering write-up on IHT.com and its use of type on the web. One of the criticisms listed was the fact that we do have widowed words in our headlines and paragraphs. Normally this would be placed on our low-priority list of improvements, but Typesites’ John Arnor G. Lom suggested a rather elegant way to fix it: put a non-breaking space between the last two words of every paragraph and headline.

Today we did just that. With a little bit of regex magic in our .JSP template (finding “[ ](\\w+)(\\S*)</p>” and replacing it with “&nbsp;$1$2</p>” to be exact) we were able to successfully rid ourselves of those pesky widows forever. Of course, the result isn’t going to be perfect every time, but it’s a subtle and pleasant improvement that should satisfy those print curmudgeons out there.

Here’s an example with a headline:

And now with the body text:

Update: I forgot to add that this fix is not retroactive. It only applies to articles published or republished after about 16h00 Paris time on Tuesday the 6th. Also, it does not apply to IHT’s blogs.
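For the curious, the substitution described above can be tried outside of a JSP template. The sketch below uses Python's `re` module rather than our Java code, so the backreferences are written `\1\2` instead of `$1$2`; the function name and sample strings are made up for illustration.

```python
import re

# A space, the last word, any trailing punctuation, then the closing tag.
WIDOW_PATTERN = re.compile(r"[ ](\w+)(\S*)</p>")


def fix_widows(html):
    """Join the last two words of each paragraph with a non-breaking space."""
    return WIDOW_PATTERN.sub(r"&nbsp;\1\2</p>", html)


fixed = fix_widows("<p>The quick brown fox.</p>")
# fixed == "<p>The quick brown&nbsp;fox.</p>"
```

A paragraph containing a single word has no space to replace and is left untouched. Headlines would need the same pattern with their own closing tag.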

5:26pm

Listen to IHT articles while doing other things

Did you know that you can listen to any article in its entirety? It’s no April Fool’s joke. Faithful readers of IHT.com have been using our “Listen to Article” feature for a long time, with more and more discovering it every day.

We launched the feature alongside the current iteration of our article design on November 1, 2006. Partnering with ReadSpeaker (who also powers our unique AudioNews application), the IHT was one of the first major news sites in the world to let visitors listen to articles rather than read them, freeing them to explore other parts of our site or others.

"Listen to Article" tool

The way it works is simple: In any article’s tool box, click “Listen to Article”. A small window containing a Flash-based media player will pop up. Almost immediately, the full audio of the article will be read to you by a surprisingly convincing computer-generated voice. While not perfect 100% of the time, the voice is impressive in that it handles various international names and abbreviations marvelously. We’ve come a long way from the days of the garbled gibberish of Speak & Spell!

One of the key decisions we made early on was to make the feature pop up in a window rather than play inline within the article page. Our reasoning was the same as why we choose to pop up video in its own window: why keep visitors glued to one page when most people would rather be multitasking?

The one tricky bit that we faced was — since “Listen” scrapes the page’s HTML for custom tags — how to get it to work properly on multi-page articles? Since the Flash media player asks for a URL as a parameter, we simply feed it the article’s printer-friendly URL which contains the entire article body.

We think the “Listen to Article” is one of IHT.com’s hidden gems, so check it out and let us know if the user experience works well for you.

7:00pm

Imitation is not always the sincerest form of flattery

We’re wondering if the folks at Rising Kashmir knew what they were paying for when they hired Sanguine Infotech Pvt. Ltd. to design their site.

Legal issues aside, if you are going to copy our design nearly verbatim, at least get it right!

fail.

11:59am

RSS now and forever

A few months ago I wrote a post called “RSS Everything” where I stated one of our goals: to make an RSS feed for as many things on IHT as we can. In keeping with the spirit of that post, on Wednesday we rolled out some new functionality for generating our RSS feeds. FYI, I use the word “RSS” collectively to include Atom feeds as well; we publish both. More and more pages are being generated at IHT.com based around events, like Davos or the Olympics. These pages contain a wealth of pre-categorized content and tend to attract engaged readers, but they were lacking dedicated RSS feeds. We have changed that, and we now also trigger the creation of all of our feeds at the time the related index pages are published.

To create our feeds we are using a Java package, Rome, that does most of the work. At first we found it a little clumsy to work with, since most of the tutorials for creating feeds (as opposed to reading feeds, which Rome can do as well) were very generic. Rome includes a generic object that will easily create feeds of various formats and versions for you, but as soon as you want to use format-specific features, Rome requires you to use objects for that format. Since we always do things the hard way, this is what we did. The beauty of the package is that once you start getting things working, it’s incredibly easy to build feeds in any way you would like. Bravo to the developer team of this package.
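To give a feel for what a format-specific feed looks like under the hood, here is a minimal sketch that builds an RSS 2.0 document for an event page. It deliberately does not use Rome; it is plain Python with the standard library's `xml.etree.ElementTree`, and the function name, feed title, and example.com URLs are placeholders rather than anything from our system.

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, items):
    """Return a minimal RSS 2.0 document as a string.

    `items` is a list of (headline, url) pairs.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires these three channel elements.
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    for headline, url in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = headline
        ET.SubElement(item, "link").text = url
    return ET.tostring(rss, encoding="unicode")


# Hypothetical event-page feed.
xml_text = build_rss(
    "Davos coverage",
    "http://www.example.com/davos",
    "Stories from the event page",
    [("Opening day", "http://www.example.com/davos/1")],
)
```

An Atom version would use different element names and a namespace, which is the kind of format-specific detail that pushes you toward format-specific objects.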

RSS plays a very important role for us. By far more traffic is to these files than any other files on our site. Granted, a large percentage of that traffic is non-human traffic but it does say a lot about how people prefer to consume our content. Enjoy our feeds. They are for you.

4:58pm

Google webmaster tools

We recently had an episode with Google News that cost us a lot of good exposure and a fair amount of traffic. Apparently, the Google News crawler is much more restrictive than the normal Googlebot. Some of the people from our newsroom had noticed that we were no longer being featured on Google News. As this was happening during the Société Générale scandal, we took this issue very seriously. Using our Google webmaster account to look at some diagnostics, we discovered thousands of “Article too long” errors. It took us a few hours to track down the source of our problems. We had, as a part of our article template, a comic-strip-like HTML construct that was serving as our “video box”. The box acts like a viewport and lets the user click through the video offering for that section in a very easy way. As it turns out, our list of videos was well over 100 for some sections, and this seems to have upset the Google News crawler. Google was interpreting this long list of video assets as a way of trying to cheat the system and subsequently blacklisted us from Google News.

Our response: pull down the video box immediately across the whole site and reprogram the thing. Dedicated readers may have noticed that the box was missing for a few days, but now it is back up with a list of 10 videos. Within hours our content reappeared in Google News and everybody was happy. There is an important lesson in this for all product development people out there, and that is this: just because you can does not mean that you should. Filtered content and manageable lists are better than throwing everything at the user.

For those of you who access this site through Google News, please accept our apology.

7:27pm

Putting blogs into context

It’s been a while since we’ve added anything to the Developer Blog. Holidays, vacations, and a mad rush to launch the new Business with Reuters section left little time to come up with cool new widgets for the site. But the new year is finally here, and as we slowly return to full staff the ideas will start to flow once again.

When we relaunched the new Business section Monday, we needed to accommodate some changes to the homepage. The bulk of what’s changed is in the C-column: 1) the addition of a Business with Reuters promotional box, 2) the relocated Reuters news feed box, and 3) the addition of a new Market Tools widget below Skybox 3 (designed and served by Wall Street on Demand). With the creation of this new Reuters “zone”, it was clear that something needed to be done about the “In Blogs” box.

Over the past few months we’ve run multiple CrazyEgg click tests and reports showed that the Blogs box was performing dismally. As a result, blog traffic suffered, but not for lack of visibility. We came to a conclusion which was similar to our thoughts on the homepage Reader Discussion box:

Visitors are not coming to traditional news sites to peruse a bucket of blogs, so context is more important than visibility.

To apply this to the rest of the site, “Blogs” shouldn’t be the product so much as each blog should be a feature of its respective parent section of the news. This should be reflected in how they are promoted at the article and section front level.

Referring once again to our click test results, we noticed that the news headlines at the bottom of the narrow column are some of the most clicked elements on the entire page, often surpassing some of the ranked stories. It became instantly clear that this “hot spot” on the page could be leveraged to generate blog traffic as well.

What at first glance seems like a gimmick to draw clicks actually serves a valid purpose to the visitor: it puts individual blog entries in the context of “The News”. Our blogs are written by some of the same journalists who write news stories, and many blog entries are about what’s in the news, so why differentiate the two products?

“Blogs” are just another method to package and publish content; the stigma of blogs having a less serious tone is in the past, at least in the corporate setting.

The way the new “In Blogs” section works is simple: every hour, a cron process queries the WordPress database and finds the two blogs with the most recent new entries. Notice, I didn’t say updated entries; that would have made it too easy to game the system, and a blogger-brawl would soon ensue over who out-updates whom. The new setup is not without ethical implications, as I will soon explain, but the initial traffic reports show some very promising results.

In the first few days, all blogs have received a traffic increase ranging from 52% to an astounding 180% above their respective daily averages. Obviously this is something we will watch closely over time, but fresh new click tests demonstrate that visitors are indeed clicking the blog headlines, just as we predicted. And because people are going directly to an entry page, their next step is to start exploring what else the blog has to offer, resulting in even more page views.

The previous “In Blogs” box needed to be updated manually by a person, and only one blog would get to be “promoted” while the rest were simply listed. The new process eliminates the need for manual labor and also introduces a fairness factor, for better or for worse: active blogs are rewarded with top exposure and traffic to be sure, but less active blogs will still benefit when their entries are published.

Because the system is automatic, the door is open to “gaming”; a clever blogger could use a few strategies for gaining maximum exposure. For example, with “The Price is Right” method, blogger A posts an entry at 8:45, blogger B posts at 8:46, blogger C bumps blogger A off the list by posting at 8:47. There are several more underhanded strategies that I don’t need to list, but we’re hoping the goodness of mankind overcomes and everyone just gets along. The web, of course, is a 24-hour operation so there’s no need for every blog to post during the same hour.
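The selection rule, and the “Price is Right” scenario, can be sketched in a few lines of Python. This is an illustration only, not our cron job: the real process queries the WordPress database, whereas here the entries are a hand-written list, and the function name, blog names, and date are invented.

```python
from datetime import datetime


def latest_blogs(entries, count=2):
    """Return the `count` blogs whose newest entry was published most recently.

    `entries` is an iterable of (blog_name, published_at) pairs. Only the
    first-publication time is considered, never the last-updated time.
    """
    newest = {}
    for blog, published_at in entries:
        if blog not in newest or published_at > newest[blog]:
            newest[blog] = published_at
    ranked = sorted(newest, key=newest.get, reverse=True)
    return ranked[:count]


# The "Price is Right" scenario: three posts within three minutes.
ENTRIES = [
    ("Blog A", datetime(2008, 1, 10, 8, 45)),
    ("Blog B", datetime(2008, 1, 10, 8, 46)),
    ("Blog C", datetime(2008, 1, 10, 8, 47)),
]

promoted = latest_blogs(ENTRIES)
# promoted == ["Blog C", "Blog B"]; Blog A has been bumped off the list.
```

Because ranking depends only on publication time, re-saving an old entry does nothing, but posting a minute after a rival does.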

That’s about it for now. Blog away.