LibraryThing: May 2007

Monday, May 14, 2007

LibraryThing for Libraries in Danbury

"You got your chocolate in my peanut butter!"

Over in Thingology I've announced the first library to use LibraryThing for Libraries: the Danbury Library in Danbury, Connecticut. They've got it all: works, recommendations, tags.

I've said I wouldn't do as much cross-posting, now that we have a combined blog feed (see over on the right). But I thought I'd mention it here, and explain a bit about what it means for LibraryThing.

First, as members of LibraryThing, you should feel proud that your data (anonymous and aggregate, as the Terms of Use say) is helping library patrons to find books. Your passions, the books on your shelves, beat statistical "paths" through books that others can follow. Your tags, the way you think about your stuff, will help people find subjects not covered by traditional subject classification.

For those concerned about development time, I want to emphasize that LibraryThing for Libraries is good for LibraryThing. On the most basic level, it's going to help our bottom line. That means more programmers making features and fixing bugs. Conceivably, it could mean cheaper accounts.

It also deepens our relationship with libraries, and returns a favor. LibraryThing was built on library data, and we've been graciously invited into the library conversation. We are charging for LibraryThing for Libraries, but our prices are in an entirely different league from what libraries are accustomed to paying for their online catalog software. And as these catalogs add "social" features, LibraryThing for Libraries will exert powerful downward pressure on prices. Ultimately, the industry needs a newcomer to take a huge slice of a smaller market. We're not going to be that company, but we can push the trend along.

LibraryThing for Libraries has also taught us a lot about library catalogs. These are some thorny, mysterious systems! Until now, we've relied exclusively on the simplicity of Z39.50 connections, which most libraries don't have. But we can do more. With our new-found experience, we can start connecting to the remaining 95%. If nothing else, this should help our language reach.

Wednesday, May 09, 2007

A very short introduction

At long last, the often requested quickstart guide—A very short introduction to LibraryThing.

It's intended as a quick overview of LibraryThing's features, to help new members get started, all the way from signing up to creating a blog widget. It's hard to come up with the balance of enough information to help without overwhelming, so I'm looking for your feedback. What should be added, changed, deleted, clarified...?

Discussion in this talk post.

Stars in reviews

Here's a low-hanging fruit. We finally put the review's star rating in the reviews. I think I'll call it a "mashup."

From The Da Vinci Code:

Labels: new feature, reviews

New search, now with "working-ness!"

I've changed how the "all fields" search for your library works. It's new and still being worked on—you can discuss problems and requests here on Talk. But it's faster, solves most character set issues and allows "fielded" queries.

Example queries:
greek history
"greek history"
greek history -war -"peloponnesian war"
gree* history
*disestablishmentarianism
tag: greek author: homer
title: finger* subject: pick-pockets
source: amazon all: history

Update: It supports "all," "tag," "title," "author," "ISBN," "subject," "dewey," "LCCN," "source," "date," "review" and "comment." (You can use plural for all names too.) By default, it now uses the field "most," which is "all" minus subjects, reviews and comments.
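To make the fielded syntax concrete, here is a minimal Python sketch of how such a query could be split into per-field terms. It is not LibraryThing's actual parser: the names `parse_query` and `normalize_field`, the shape of the returned dictionaries, and the rule that a field name applies to every term until the next field name are all assumptions for illustration. Only the field names, the plural forms and the "most" default come from the update above.

```python
import re

# Field names listed in the post, plus the default "most" field.
FIELDS = {"all", "most", "tag", "title", "author", "isbn", "subject",
          "dewey", "lccn", "source", "date", "review", "comment"}

# A token is either a quoted phrase (optionally negated) or a run of non-spaces.
TOKEN = re.compile(r'-?"[^"]*"|\S+')


def normalize_field(name):
    """Return the canonical field name, accepting plurals, or None."""
    name = name.lower()
    if name in FIELDS:
        return name
    if name.endswith("s") and name[:-1] in FIELDS:
        return name[:-1]
    return None


def parse_query(query, default_field="most"):
    """Split a query into terms tagged with field, phrase and exclude flags.

    Assumption: "tag: greek" is written with a space after the colon, as in
    the example queries, and the field stays active until another one appears.
    """
    terms = []
    field = default_field
    for token in TOKEN.findall(query):
        if token.endswith(":") and not token.startswith(("-", '"')):
            known = normalize_field(token[:-1])
            if known:
                field = known
                continue
        exclude = token.startswith("-") and len(token) > 1
        text = token[1:] if exclude else token
        phrase = len(text) >= 2 and text.startswith('"') and text.endswith('"')
        if phrase:
            text = text[1:-1]
        terms.append({"field": field, "text": text,
                      "phrase": phrase, "exclude": exclude})
    return terms
```

Wildcards such as `gree*` pass through untouched as term text; expanding them would be up to whatever index sits behind the parser.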

Labels: bugs, features, search

Monday, May 07, 2007

Going to Book Expo America in New York

Book Expo America, ABA's annual book industry trade convention, is in New York City this year, and I (Abby) am going to be there. I'll be speaking on Thursday, May 31st (from 1-2pm; mark your calendars!) on a panel called "Using Social Networking to Build Author Brands."

We just found out that our competitor, Shelfari, is also going to be at BEA this year, and is apparently using some of their Amazon funding to co-sponsor an event. Hey! Well, not only does LibraryThing appear to have sixty-five times as many book lovers as they do, but we think we have a lot more to offer authors, booksellers and publishers, and we're going to prove it.*

Authors. It was at last year's BEA that we launched the LT Author program. After Tim and I spent a day walking around trying to describe LT in a nutshell**, we realized we had been telling people, "it's like MySpace, but for booklovers." Well, MySpace is all about bands and musicians promoting their music. Wouldn't LibraryThing be a good place for authors to do the same? What better place to promote your new book than a website full of avid bibliophiles?

And so was born the LT Author button, a shiny yellow badge that connects an author's "author page" with their profile page. So far LibraryThing has snagged 395 authors. (See the complete list.)

Best of all, they're not just authors who clicked a box. To be part of the program, you have to have a LibraryThing account and put in at least 50 books. What is your favorite author reading? Find out.

Neil Gaiman's author photo.

Members have added over 15,000 pictures and photos of authors (see recently added ones), with alibrarian and leebot leading the pack. They deserve some kudos. It's actually a pretty intensive process, often involving writing authors, publishers, or photographers for permission, so the sheer number of photos is all the more impressive. Plus, it makes for a nice gallery. :)

LibraryThing members have also added over 92,000 links to author pages—links to author home pages, blogs, publisher pages, Wikipedia pages, interviews, articles, fan sites. That's a lot of links.

Booksellers. We'd love to add more bookstores to our "bookstores that integrate"—adding availability and pricing information on every work page. We've got only three so far, but we'll be adding two major "chunks" of them in the next few months—to at least 100 total. It's a great way for people to be able to see at a glance if a book is at their local bookstore.

Publishers. So far, we're not doing anything for publishers! But there's a big announcement coming soon. Be on the edge of your seats!

So what can we do to make LibraryThing big at BEA this year?

Our big idea so far is a par-tay. Of course, anyone and everyone can find some time to talk to me during BEA, but I'd like to have a big meet-up. Authors, publishers, booksellers, and hey, readers too. Anyone in NYC who's around is invited, not just the book-industry professionals allowed to go to BEA (they have to restrict it, because there's so much free merchandise on offer).

I made a BEA 2007 group; post there with ideas of where we should meet (I'm thinking maybe a restaurant near the convention center?). New Yorkers, I call on you for suggestions!

We're also thinking about bringing a bunch of CueCats, and giving them out to authors, to entice them into becoming LT Authors... What else?

*[Written by Tim] Shelfari doesn't release any statistics. But they do release the top 20 bookshelves. The 20th bookshelf on Shelfari has 1,360 books. LibraryThing has 1,378 members with that many. Hence 1,378/20 = 68.9 times as large. You will note that we do not abuse our other competitors, just Shelfari. Some of them are quite good! There's a good thread going about them. We want people to check them out, and come back to tell us how to improve LibraryThing!
**"This is me in a nutshell: HELP! I'm in a nutshell!"

(photo by Rick Dikeman on Wikipedia, under GNU Free Documentation License)

Subjects get faster; the rest will follow

Everything on the web is better if it's faster. Slow pages are a silent killer.

So we're working to speed things up. We've long done "situational" caching. But our growth is relentless (we'll hit 200,000 registered members today) and we've had no good, generalized solution. We've recently been working on two solutions, for database and page-level caching. Together they should speed up certain cacheable pages, like works, authors and tags. The more resources we can free, the faster the uncacheable pages, like Talk, will become as well.

So far, only subject pages are being cached.

Subject pages were a big problem. The worst took a minute to load. When Google's "spider" program went at them, with one request/second, the servers would sweat. A subject page is now cached whenever someone hits it, and stays cached for at least a week.

Subjects are a test. There are some kinks to work out. (For example, changing the non-English translations doesn't immediately clear all affected pages.) Once we get where we want, we'll roll out page caching wherever we can use it. Query caching will follow.
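For the curious, the cache-on-first-hit idea can be sketched in a few lines of Python. This is an illustration only, not the code running on the site: the `PageCache` class, the in-memory dictionary, and the exact one-week lifetime are assumptions (the post only says "at least a week"). The `invalidate` method stands in for the kind of explicit clearing that the translation kink calls for.

```python
import time

WEEK_SECONDS = 7 * 24 * 60 * 60  # assumed lifetime; the post says "at least a week"


class PageCache:
    """Cache rendered pages by path, filling an entry the first time it is hit."""

    def __init__(self, ttl=WEEK_SECONDS, clock=time.time):
        self.ttl = ttl
        self.clock = clock      # injectable so expiry can be tested
        self._pages = {}        # path -> (expires_at, html)

    def get(self, path, render):
        """Return cached HTML for path, calling render(path) on a miss."""
        now = self.clock()
        entry = self._pages.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]
        html = render(path)     # the slow part: only paid on a miss
        self._pages[path] = (now + self.ttl, html)
        return html

    def invalidate(self, path):
        """Drop one page, e.g. after an edit that changes what it shows."""
        self._pages.pop(path, None)
```

The hard part in practice is knowing which paths to invalidate when shared data changes; a single edit can touch many cached pages.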

Saturday, May 05, 2007

Conversation = Excellence

LibraryThing has always depended on members to set development goals and refine (or ditch) features. But it's amazing how well it's worked with the new "affinities"* feature. We simply could not have anticipated how members would shape our thinking. (I will never ever develop another project in a small, closed group, with occasional trips to watch a "focus group" from behind smoked glass.) We're still watching reactions on the blog, and on a now-130+ Talk topic, but we have some good ideas. When Altay returns from Boston, we'll hammer out changes, including customization of the look, and the ability to turn it off.

I started another thread I want to highlight, about LibraryThing's strategy and a hiring decision for the non-English LibraryThings. Do we hire someone, and what can they do? I'm hoping the thread gets some traction, at least among the users of our dozen-plus non-English sites. We need a non-English plan.

Part of the problem is technical, starting with better character support. But there's a feedback loop. Right now, the non-English sites can't be the coding priority because they're not contributing as much to our growth, or to our finances. (Not that they're small. Our non-English sites appear to have more action than our largest English-language competitor.) If we hired someone—and had something for that person to do—we'd have a stronger incentive to work on it.

*We called them "affinity percentiles," but it got chipped down nicely by SilentInaWay. Case in point.

Labels: conversation, features, non-English

Friday, May 04, 2007

Affinity percentiles and Altay

Altay (middle), John (sweatshirt), Tim (right), Abby (encased in her spherical "soul cage")

We're introducing an important new feature, but only just. The feature is called "affinity percentiles." Basically, we show numbers next to other users' names. These represent how "similar" your library is to theirs.

We've started it off on just one area of the site, the message pages in Talk (example). We plan to roll it out across the site, but not until we get a lot of feedback. I have a feeling some members will love it, but some won't. This isn't something we want to do lightly.

The number needs some explaining. (It may be too subtle, and we should fall back to a more straightforward "books shared.") Basically, the higher the better. The person who shares the most books with you will have a 99%; the person who shares the least gets a 1%.

The percentage isn't the number shared—65% does not mean a user shares 65% of their books; it means that the user shares more books than 65% of users. Two other factors come into play:

a member has to share five books to get an affinity percentile
"sharing" is weighted by book obscurity and library size. A user with 100 books who shares 20 obscure books with you ranks much higher than a user with 10,000 books who shares some very popular novels.

Other features:

If you hover over the percentile, you'll get the number of shared books. We've thought of having it actually show the books.
The percentile box is colored in line with the number—the hotter the higher.
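The percentile rules described above (a five-book minimum, weighting by obscurity and library size, and a 1 to 99 scale) can be sketched in Python. The real weighting formula isn't given in this post, so the one in `affinity_score` is purely hypothetical: rarer books count for more, and bigger libraries are discounted, both on a log scale. The function names and the choice to rank only among members who clear the five-book minimum are likewise assumptions.

```python
import math

MIN_SHARED = 5  # from the post: a member has to share five books


def affinity_score(shared_books, owners, library_size):
    """Hypothetical weighting: obscure books count more, big libraries less."""
    rarity = sum(1.0 / math.log2(2 + owners.get(book, 1))
                 for book in shared_books)
    return rarity / math.log2(2 + library_size)


def affinity_percentiles(my_books, libraries, owners):
    """Map each qualifying user to a percentile between 1 and 99.

    my_books:  set of book ids in your library
    libraries: dict of user -> set of book ids
    owners:    dict of book id -> number of members who own it
    """
    scores = {}
    for user, books in libraries.items():
        shared = my_books & books
        if len(shared) >= MIN_SHARED:
            scores[user] = affinity_score(shared, owners, len(books))

    total = len(scores)
    percentiles = {}
    for user, score in scores.items():
        # Share of qualifying users this user outranks, clamped to 1..99.
        below = sum(1 for other in scores.values() if other < score)
        percentiles[user] = max(1, min(99, round(100 * below / total)))
    return percentiles
```

With this weighting, a 100-book library sharing 20 obscure titles outranks a 10,000-book library sharing 20 bestsellers, which matches the example given in the list of factors.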

Some questions:

Are the percentiles too hard to understand? Would shared numbers be better?
Is the weighting confusing?
What should happen when you hover over it? When you click on it?
Where should it go? Where shouldn't it go?

How? I've wanted to do something like this for months. It's a surprisingly difficult technical problem. You can't calculate it on the fly every time; that would be insane. But caching the data gets big quick. Imagine a "Battleship" grid of users, 190,000 by 190,000. If you stored a single byte for each connection (the number of shared books) it would amount to at least 16 gigabytes of data (190,000 squared/2). The solution I came up with involves efficient short-term caching, and ignoring members with fewer than five shared books. We've actually been running it on the Talk pages since last night, waiting to make it visible until we knew it wouldn't melt our servers. (So far no melt!)
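Working through the grid arithmetic: 190,000 squared, halved, at one byte per pair comes to about 18 billion bytes. The Python sketch below checks that figure and shows one way to avoid the full grid by counting only pairs who share at least one book, then dropping pairs under the five-book threshold. The function names and the inverted-index approach are assumptions for illustration, not a description of what actually runs on the site.

```python
from collections import Counter
from itertools import combinations


def dense_grid_bytes(users, bytes_per_pair=1):
    """Bytes needed to store one value per unordered pair of users."""
    return users * users // 2 * bytes_per_pair


def sparse_shared_counts(libraries, min_shared=5):
    """Count shared books per pair of users, keeping only pairs at or above min_shared.

    libraries: dict of user -> set of book ids
    """
    # Invert the data: for each book, who owns it?
    owners_by_book = {}
    for user, books in libraries.items():
        for book in books:
            owners_by_book.setdefault(book, []).append(user)

    # Every pair of owners of the same book shares that book.
    counts = Counter()
    for owners in owners_by_book.values():
        for pair in combinations(sorted(owners), 2):
            counts[pair] += 1

    return {pair: n for pair, n in counts.items() if n >= min_shared}


print(dense_grid_bytes(190_000))  # 18050000000
```

Pairs that share nothing never appear at all, which is where the savings come from. Very popular books still generate a lot of pairs, so a real version would need to cap or sample them.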

You'll notice the numbers aren't there when you first hit the page. They come in a second or two later. This is "Ajax" at work, and was done to prevent the new feature from slowing Talk down.

The real benefits will come when the feature is distributed across the site. I'm particularly interested in seeing affinity percentiles on reviews, and sorting by them. Ultimately, I don't care what 300 people think about The Da Vinci Code. I want to know what Tim-ish people think of it.

Why? The crux of the idea is to highlight what makes LibraryThing's social system work, so-called "social cataloging." Vanilla social networking is structured around "friends." That's a powerful idea, but it has limits. It can be too "binary"; and the dynamics of "friending" a stranger are lost on many of us. At its best, social cataloging gets at something more nuanced. If I share 50 books about ancient history with you, there's a degree, a nuance and a semantics to the connection that opens up a world of possibilities. Some are social and some aren't. I might want to chat with you about the books we've read, or I might not. Either way, I benefit. The rest of your library is probably interesting to me. And your opinions have a claim on my attention that no anonymous guy on Amazon gets.

This post also introduces Altay Guvench (username: Altay), who did the JavaScript work behind affinity percentiles. This was actually a toss-off, but Altay was the force behind the much more amazing JavaScript in LibraryThing for Libraries. That stuff is a work of art: JavaScript inserting JavaScript. It might actually be self-aware! Altay will be working on the site generally, with a tilt toward things that JavaScript can improve, like the widgets.

Altay in a nutshell: Portland native. Harvard undergrad. Bassist for the alt-country band Great Unknowns (toured with the Indigo Girls! Reviewed ecstatically. Listen to a free song!). Co-founder of Y-Combinator-funded startup AudioBeta. One of only three members on LibraryThing with Optical holography : principles, techniques, and applications. Scheme hacker. Nerd, but a nerd who rocks out.

Labels: affinity percentiles, altay, features, soul cages

Thursday, May 03, 2007

Combined blog feed available

I used Yahoo Pipes to make a combined feed for this blog and our Thingology blog. It was easy to do, and the result is pretty useful. The three feeds are as follows:

LibraryThing Blog

Thingology Blog

Combined Blog Feed

I also edited the employee list on the right, to add Altay. He is the magic behind the LibraryThing for Libraries JavaScript, but almost nobody's seen that yet, so we're waiting for his first user feature to give him a proper introduction.

Wednesday, May 02, 2007

Many more Wikipedia citations

You'll notice many more Wikipedia links from work pages. The total has increased by about 200%, and the coverage by at least that.

This improves what I did in February. That worked by looking for ISBN patterns. Of course, not all books cited in Wikipedia have ISBNs. And even when there is one, many Wikipedia contributors omit it. (As far as I'm concerned, ISBNs look chintzy in a bibliography anyway.)

I've redone it, this time also looking for telltale title/author patterns, and running the matches against LibraryThing's vast and usefully messy dataset. The logic is somewhat fuzzy and therefore imperfect. But I haven't noticed any problems.
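As a rough illustration of what "fuzzy" title/author matching can look like, here is a small Python sketch using the standard library's `difflib`. It is not the logic described above, whose details aren't given here. The assumptions are mine: candidate titles are pulled from Wikipedia's double-apostrophe italic markup, each candidate is compared against a catalog of normalized titles, and a match only counts if the author's surname also appears somewhere in the article. The names `find_citations` and `normalize`, the catalog layout and the work id are hypothetical.

```python
import difflib
import re

# Wikipedia marks italics with doubled apostrophes: ''Title''
ITALIC = re.compile(r"''([^']+)''")


def normalize(text):
    """Lowercase and strip punctuation so near-identical titles compare well."""
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()


def find_citations(wikitext, catalog, cutoff=0.9):
    """Return the work ids whose title and author both appear in the article.

    catalog: dict of normalized title -> (work_id, author_surname)
    """
    titles = list(catalog)
    lowered = wikitext.lower()
    found = set()
    for candidate in ITALIC.findall(wikitext):
        close = difflib.get_close_matches(normalize(candidate), titles,
                                          n=1, cutoff=cutoff)
        if not close:
            continue
        work_id, surname = catalog[close[0]]
        if surname.lower() in lowered:   # require the author as a sanity check
            found.add(work_id)
    return found


catalog = {"the structure of scientific revolutions": ("work-1", "Kuhn")}
text = "Kuhn argued in ''Structure of Scientific Revolutions'' that science shifts."
print(find_citations(text, catalog))
```

Requiring the author's name trades recall for precision: it skips some genuine mentions but keeps common titles from matching the wrong work.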

The number of citations expanded a lot.* Some entries exploded. Take Thomas Kuhn's The Structure of Scientific Revolutions:

Notably, it caught casual references to books, not just structured ones. For example, the article on Science wars mentions Kuhn's work in running prose, not in the bibliography or footnotes.

I haven't updated our free Wikipedia citation feed. That maps articles to ISBNs, but the new data is work-based. If anyone wants to use the new data, let me know and I'll tackle the problem. Cool as I think it would be, I haven't seen any libraries adding Wikipedia links to their catalogs yet.

*The fact that it's a new feed, and the somewhat fluid interactions between ISBN-based and work-based matching, make it tricky to estimate, but it looks like a 200% increase.
