Wednesday, April 7, 2010

JSON-Enabling your Web-Application: Tread. Very. Carefully.

None of what follows is new. Many of the issues below stem from recurring, tempting techniques for circumventing the cross-site data-loading security policies that browser vendors put in place a very long time ago, for very good reasons.

So you've created a very elegant and easy-to-maintain Web Application, perhaps leveraging some MVC patterns.

Your Controllers may be mapped to URLs such as these:
  • FavoritesController: list() and view(int id) methods:

    1. /favorites/list
    2. /favorites/view/1
    3. /favorites/view/2
    4. ...
  • SomeOtherControllers following similar patterns
Your web-based application works pretty well and features a decent amount of no-nonsense better practices, among which you might find:
  • user account creation with e-mail verification and a CAPTCHA;
  • md5-hashed password authentication, with a large salt unique to each user;
  • no glaringly obvious SQL-Injection vulnerabilities, because you're using a decent database abstraction layer;
  • no unchecked user input making it into your pages, so you're not feeling too vulnerable to obvious XSS Attacks;
  • no non-idempotent (state-changing) actions reachable through plain RESTful GET URLs, so you're less vulnerable to the insanely obvious CSRF Attacks; and you're clearly not stopping there: every transactional HTTP POST must carry a non-guessable, session-lived token as a parameter (sketched below).
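
Just to make that last point concrete, here's a minimal sketch of a session-lived anti-CSRF token in plain PHP; the parameter name and surrounding structure are made up for illustration, and your framework likely has its own equivalent:

    <?php
    // Hypothetical anti-CSRF guard: one non-guessable token per session.
    session_start();

    if (empty($_SESSION['csrf_token'])) {
        // sha1/uniqid/mt_rand: good enough to be non-guessable for this purpose.
        $_SESSION['csrf_token'] = sha1(uniqid(mt_rand(), true));
    }

    // Every transactional form embeds the token as a hidden "csrf_token" input,
    // and every POST is rejected unless that token comes back intact.
    if ($_SERVER['REQUEST_METHOD'] === 'POST') {
        $sent = isset($_POST['csrf_token']) ? $_POST['csrf_token'] : '';
        if ($sent !== $_SESSION['csrf_token']) {
            header('HTTP/1.1 403 Forbidden');
            exit('Missing or invalid request token.');
        }
    }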

At this point you're pretty confident that a determined cracker will actually have to do some work to mess with your application to uncover any sensitive data it may have.

Then you get ready to Go Mobile. It only makes sense: you have a compelling application and you wish to offer your users a compelling extension of your application to their Mobile Universe, while creating additional revenue opportunities. Everybody wins.

As a design team starts putting together mock-ups of the handheld experience, which may take quite a few weeks, you realize that regardless of what they come-up with, the handheld application will almost certainly need to exchange data with the web-based application. So you figure you might as well start exposing web services to cater to various use-cases.

Your MVC stack looks neat, and now just plain screams for exposing a JSON interface to most of your Controllers. It only makes sense. Once you're done, you can send the API to the Objective-C/Java Developer who'll be building your mobile apps for Blackberries, iPhones and Droids.

So you rig your controllers to become "json-aware": instead of only being able to output HTML, you're now also able to send a JSON payload:
  • FavoritesController: list() and view(int id) methods:
    1. /favorites/list/json
    2. /favorites/view/1/json
    3. /favorites/view/2/json
    4. ...
Before doing any of this, STOP and realize this:

JSON is valid JavaScript. Valid JavaScript can be loaded in a good ol' HTML script element, by any site on the Web, directly from your site. Whatever your script returns, such as a list of an authenticated user's favorites, is then more or less fully accessible to any other script running on that foreign document (historically, for example, by overriding the Array constructor before the script element loads, the trick behind several well-publicized JSON-hijacking exploits).

Forget your Handheld/Mobile project for a second as this isn't the vector of the gaping vulnerability you've just created:

Consider http://evilsite/somebadwebpage.html with the following line of code: <script src="http://yoursite/favorites/list/json"></script>


If evilsite manages to lure some of your users, perhaps thru some crafty social-engineering sprinkled with bit.ly obfuscation, they're now in a position to capture a victim's favorites.

If you're going the JSON route, you might consider the following simple techniques:
  • Require a completely separate authentication handshake for your JSON services; they should not work simply because a user happens to be "authenticated on the site" while browsing from their desktop.
  • Wrap your JSON payload in a light XML enclosure: it'll be enough to make browsers barf on a JavaScript parse error should someone try to reference your JSON URI from a script element in their web document (sketched below).
    <data>{jsonpayload}</data>

    Don't go loading a full XML/DOM parser on the consuming end; just substring/indexOf/lastIndexOf your way to the JSON payload.
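
Here's roughly what that enclosure could look like on the server side, in plain PHP for illustration; the action name and the data passed in are made up:

    <?php
    // Hypothetical controller action behind /favorites/list/json
    function list_favorites_json(array $favorites)
    {
        $payload = json_encode($favorites);

        // Wrapped this way, the response is no longer valid JavaScript,
        // so a <script src="..."> inclusion from another site chokes on it.
        header('Content-Type: application/xml; charset=utf-8');
        echo '<data>' . $payload . '</data>';
    }

    // The mobile client does the reverse: take the substring between the first '>'
    // and the last '<', then hand it to its JSON parser. (If you ever do feed this
    // to a real XML parser, escape or CDATA-wrap the payload first.)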

Thursday, September 6, 2007

Multi-Column Layouts with Smarty/PHP

Developer Kevin Sours had been looking for elegant ways to create multi-column layouts from variously-sized collections in Smarty, and couldn't quite find a plugin that would fit his needs.

As a result, Kevin wrote two plugins we're happy to release under a BSD/Open Source License to the Smarty developer community:

split:
Split works much like "foreach" but instead of iterating over the array an element at a time, it iterates over a set of subarrays of roughly equal size. The motivation is to allow multicolumn layouts without either a lot of ugly smarty markup or requiring the generating page to handle breaking the array down for the column display (which requires encoding the number of columns in the php code).


split_row:
Like split, split_row is a foreach-like structure that returns parts of an array for the purpose of doing multicolumn layouts. The difference is that split_row returns the values in row order instead of column order.
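
To give a feel for how split might be used, here's a rough three-column sketch; the parameter names below are guesses for illustration only, so check the plugin source for the actual interface:

    {* Hypothetical usage: split $items into three roughly equal columns. *}
    <table>
      <tr>
        {split from=$items item=column pieces=3}
          <td>
            <ul>
              {foreach from=$column item=entry}
                <li>{$entry}</li>
              {/foreach}
            </ul>
          </td>
        {/split}
      </tr>
    </table>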


Feel free to poke at it and give us feedback in comments.

Wednesday, August 1, 2007

Shared Library Delivery over CDN?

A slew of enabling libraries and frameworks are available to enhance functionality, usability, and interactivity of web applications we build ... mootools, prototype, scriptaculous, dojo, yui, to name a very, very few.

In the process of rebuilding our DWR-driven Used Cars Search application, we've been working on a couple of small, narrowly-scoped libraries of our own: IBDOM, Favoracious.

While these libraries afford developers tremendous agility to build advanced applications, some get large in size, and are often served directly from those applications in less-than-optimal ways, resulting in perceptible sluggishness. This is where a highly-optimized Content-Delivery Network (CDN) to serve such libraries can dramatically improve application performance.

Beyond the authoring and maintenance of an open-source UI framework, Yahoo's YUI group provides an added valuable service: They allow developers to directly include the YUI libraries from their "high-performance" Content Delivery Network.

Yahoo's inaugural blog post on the subject goes over the advantages of leveraging their delivery framework:
(...) Moreover, Yahoo!’s hosting network is configured to serve JavaScript and CSS using gzip compression. We minify YUI JavaScript before pushing it to our servers; in combination with gzipping, this results in a 90% reduction in transmitted filesize as compared to the footprint of YUI’s raw (and commented) source. CSS files weigh 60% less on the wire using gzip compression. If your current host does not support mod-gzip or mod-deflate, the advantages of using Yahoo! hosting could be dramatic. (...)

... while they also caution:
Serving YUI from Yahoo! servers won’t be the right decision for all implementers; if you’re aggregating or customizing YUI source code and serving it from a highly performant host, there will be little reason to switch. However, for some implementers the provision of free, robust, edge-network hosting will have significant upside.

Should a large number of sites elect to load the YUI libraries directly from the same Yahoo CDN URL, the caching benefits and efficiencies could be tremendous.

A person browsing the web could load a YUI library into their browser's cache upon first visiting "Site A". Since Yahoo sets an aggressive "Expires:" HTTP Header, the user's browser will likely not even try to "revalidate" the file with a conditional HTTP GET for quite some time during subsequent visits to "Site A". Later, the same person might visit "Site B", which also happens to be loading the same YUI Library from the same Yahoo CDN URL. The browser will recognize it, realize it has it in the cache, and, in theory, not even try to revalidate it with a conditional HTTP GET. Meanwhile, "Site B" might feel "impressively fast", loading quickly even on the user's first visit. That's because "Site A" laid the groundwork!
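
For what it's worth, that aggressive "Expires:" behavior is only a few lines of Apache configuration with mod_expires; a sketch, assuming a far-future policy along the lines described above:

    # Hypothetical mod_expires snippet for far-future caching of shared libraries.
    ExpiresActive On
    ExpiresByType application/x-javascript "access plus 1 year"
    ExpiresByType text/css "access plus 1 year"

    # Caveat: with a far-future Expires header, the file name must change whenever
    # the library changes (e.g. yui-2.3.0.js), or browsers will keep the stale copy.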

... You get the idea.

Yahoo's optimizations around HTTP Performance and Caching, many of which they've outlined in their 13 Rules, ought to be a great contributing factor to limiting costs of operating their Library Content Delivery Network.

There are however many other widely-used libraries beyond the ones Yahoo authors, that could benefit from such a model. If I were to look at my browser's cache right now, I could see a dozen instances of the same scriptaculous library loaded from a dozen social networks I've visited in the past. It's getting to be silly, inefficient.

As both Library Authors and Implementers, we'd like to think of ways the developer community could benefit from an optimized framework similar to Yahoo's model.

Which is where a Shared Content Delivery Network for client-side libraries might become interesting.

Such a framework would allow site owners and developers to "register for the right to include a library on their site from the shared content delivery network URL". Let's face it, bandwidth and CDN infrastructures cost money, and access to such services should be contingent upon modest charges, tied to a Paypal or Google Checkout account specified during the registration process.

As the nature of those libraries is to be embedded within documents, the HTTP Referer (sic) should be sent with every request, at which point the "CDN Service" could verify that the originator site is actually registered. If it isn't, an HTTP 403 (Forbidden) response would be thrown. Each registered "Hit" would be tallied to a given account, and settled via Paypal or Google Checkout at the end of the month (or any other recurrence pattern). If no "Referer" header is present in the request, then a 403 would be thrown.
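
A sketch of that gatekeeping logic, in plain PHP for illustration; a real CDN would implement this at the edge, and the two helper functions here are assumed, not real:

    <?php
    // Hypothetical origin check for the shared-library CDN described above.
    $referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
    $host    = $referer ? parse_url($referer, PHP_URL_HOST) : null;

    // No Referer, or an unregistered originating site: refuse to serve the library.
    if (!$host || !is_registered_site($host)) {       // is_registered_site(): assumed
        header('HTTP/1.1 403 Forbidden');
        exit;
    }

    // Otherwise tally the hit against the registered account for later settlement.
    record_billable_hit($host);                        // record_billable_hit(): assumed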

Beyond site owners, essentially "the consumers/implementers of enabling libraries", we need to consider Library Authors. How do we determine who gets to put their Library on the Shared CDN? A human-driven application process might be appropriate.

Looking at the tip of a likely large iceberg of custom functionality, such a framework definitely falls outside of the typical "file pushing" "out-of-the-box features" you might find on most commercial CDNs. However, it ought to be possible to leverage some of their more advanced features to build this custom framework. One of them might elect to build it in-house, or one of their clients might build a prototype.

Akamai comes to mind. They support deployment of custom J2EE apps onto their network. But there are others too. Most sport a large worldwide infrastructure for edge-caching and optimized content delivery, be it static or transient through efficient network routing.

Developing this framework would likely enable them to gain some revenue catering to a more "Long-Tail" clientele, to be regarded by developers as innovating pioneers (potentially leading to larger accounts), and to become a de-facto part of vital Web Infrastructure, thereby further cementing their longevity.

Meanwhile, web masters would likely save on bandwidth costs and web surfers would experience dramatically faster-loading sites, resulting in an overall more efficient Web.

Update 11/21/2007: See Ajaxian.com's entry on CacheFile.net. It looks promising! As of this writing, it doesn't offer CDN-backed delivery or compression, and it ought to provide for some sort of revenue stream (as suggested above), beyond donations, to at the very least cover its operating costs. Let's watch them closely as they evolve! :)

Wednesday, July 18, 2007

TrekEarth Photo Widget

Today we're releasing a photo widget to the TrekEarth community. This widget gives TE members the ability to create and publish viewers based on their personal photo collections. The quality of the imagery on TrekEarth is amazing. Check out this sample viewer based on photos by site founder, Adam Silverman.

Monday, July 9, 2007

Props to Facebook

"Facebook has an incredibly strong engineering team, but for some reason, it doesn’t yet get the recognition that Google gets as a technology company." That's a quote on Inside Facebook from fellow Stanford alum and former Google developer Justin Rosenstein, who left the Mountain View mothership to join Facebook. Facebook has certainly put up some very credible engineering stats, including the largest production memcached install, thier open source contributions like thrift, and the sustained uptime in spite of their enormous growth:




I just want to shout out to Justin and say: we see you. We get it. Google's not the only game in town for top engineers. Facebook's a great example. Internet Brands is another :)

Thursday, July 5, 2007

Internet Brands Welcomes Jelsoft!

James Limm, CEO of UK-based Jelsoft Enterprises, makers of the best-of-breed PHP-based community platform vBulletin, wrote a post yesterday to his customers about the acquisition of Jelsoft by Internet Brands.

Tuesday, June 19, 2007

iPhone: Apple's VoIP End-Game

Update 01/2008: SIP-based VoIP on the iPhone has been coming:


When Steve Jobs first demoed the iPhone in January 2007, he made it clear that reaching someone by typing their phone number onto a keypad was no longer acceptable, albeit tolerated. Instead, he showed an Address Book interface that unifies the concept of a "Person" across all forms of communications on the iPhone, be it iChat, e-Mail, or a Normal Phone Call. Since version 10.2, Mac OS X has pushed Address Book integration across disparate Applications to unify the concept of a "Person", and the iPhone simply builds upon the same philosophy.

While an "Address Book" seems as trivially simple a concept as it isn't new to anyone who's used a mobile phone within the last decade, seeing it executed "The Apple Way" in a larger synchronized ecosystem, helps paint a picture of possibilities that lie ahead.

Picture unlimited free calls over WiFi/IP without even having to "think about it", by simply picking a Person from your Address Book, and hitting "call" ... The same way you'd make a Normal Phone Call. All this powered behind-the-scenes by an outstandingly executed convergence of enabling technologies and open-protocols.

E-Mail addresses define a universal mechanism for a message to make its way into our INBOX, regardless of who provides the sender's E-Mail service. E-Mail functions on-top of open and interoperable standards. As such, there is vibrant competition to obtain "E-Mail Service".

Wouldn't it be nice to enjoy the same type of competitive and interoperable landscape when it comes to actually speaking to and video-conferencing with people? Beyond those silly phone numbers controlled by a handful of phone companies, what if we could pick from a myriad of competing "Providers" to obtain a global address allowing people to "call us" over the Internet, regardless of who their "Providers" are?

This is where SIP comes-in. It's been around and used on computers for a while, but it's been waiting for the right combination of enabling devices and software to truly break onto the handheld mainstream.

iPhone, with its WiFi capability, Address Book integration, and advanced operating system, is getting us one step closer.

A SIP Address looks just like an E-Mail address. A Person's SIP Address could easily be stored in the iPhone's Address Book. Apple could build SIP-capability right into the operating system, pre-configured with a number of existing SIP Providers for one-click setup, while still allowing for custom configuration, following a model very similar to E-Mail.

There are a few SIP Providers out there. But Apple could easily roll out its own SIP infrastructure as part of the .Mac framework, increasing their chances of providing a superior out-of-the-box experience, while promoting the .Mac brand to ... competitive usefulness. From here, the sky's the limit as to what Apple can do, leveraging iPhone's brand and near ubiquitous and still increasing WiFi penetration. Forget about fighting over 3G vs GSM. WiFi and IP are universal world-wide.

SIP call quality can be vastly superior -- think CD quality -- to a Normal Phone Call, as it strives to remain pure data exchange over the Internet Protocol on broadband connectivity, without ever getting wedged thru the codec limitations of any Normal Phone System.

When calling somebody, the iPhone could detect whether WiFi connectivity is available, and whether there is a SIP Address for the person I'm looking to call. If both these conditions are met, the iPhone could perform a "pure SIP Call" over the Internet, without ever touching the carrier's or any phone company's network. Blam. Free call. An icon might indicate to me that this call is a free, un-metered Voice-over-IP call.

Even if the person I'm trying to call doesn't have a SIP address, some SIP Providers offer the ability to relay calls to the Normal Phone System (aka the PSTN in industry jargon) at substantial cost savings on unlimited plans, at which point all the iPhone has to care about is that WiFi connectivity is present.

In terms of user experience, I wouldn't do anything differently: I'd pick the person I'm looking to call from my address book, and push the "call" button. What's behind-the-scenes is magic-sauce I, as a user, don't care about beyond vaguely knowing that the iPhone picked some Internet-WiFi thingie to save me money, and some icon is telling me I can keep talking for as long as I want.

Update 6/21/2007: As pointed out in comments below, WiFi+VoIP on "phones" aren't new, and carriers have somehow "dealt with it". Even if Apple further blurs the line between VoIP and Normal Phone Calls, there are still profit incentives and competitive market pressures that may very well entice AT&T to embrace VoIP.

Monday, June 18, 2007

Who Participates Online?

Cool infographic on BusinessWeek about who is doing what online. Click through for more detail :)

Friday, June 1, 2007

Microsoft Surface

Microsoft is currently shooting for a Winter 2007 release of Surface, according to the small red "Find It" link on their flash page. This looks like some truly amazing computing at the service of human interface.

Let's see if Apple picks-up this gauntlet at WWDC 2007.

They'll be demoing a preview at SIGGRAPH in San Diego, CA, August 5-9.

Via TAB.

Wednesday, May 23, 2007

Human Lessons in Software Engineering?

A Software Engineer, who's struggled a good chunk of his career with poor engineering and management teams, developing applications with Microsoft technologies, learns new technologies known to be used by the type of engineers he'd been longing to work with: Ruby on Rails development on Mac OS X, among others. After a 3-year journey, he finally finds and accepts a Director position at a company that embraces his passions.

Read the story here: Strange new worlds, and programming languages...: Good bye Microsoft; Pete has now left the building!

Despite the juicy and passionately controversial platform angle, the lesson I personally take from his story is more human than technological. It shows me that it's at least as critically important for a passionate software engineer to thoroughly interview a prospective employer, especially a lesser-known one, as it is for that employer to interview the engineer. Gauging a company's "mentality" and engineering practices can be difficult, at times an occult art. Add to that the reality and pressure of varying personal circumstances, and posturing from interviewers throwing around the big "marketplace leadership, cutting-edge blah blah" marketese, and one can see how easy it can be for a driven employee to step into a less than desirable work environment.

Had Peter had the benefit of his current experience earlier in his career, he might very well have recognized poor companies and poor teams. He might, for example, have stayed away from any company that didn't throw engineers at his interview, grilling him on challenging and exciting problems, and soliciting questions back. This by far transcends technology and development platforms.

There are great developers, building great applications that do involve Microsoft technologies. Joel Spolsky's got a few at Fog Creek Software.

But the noise is obviously high. Great companies are few, and either hard to find or extremely hard to get into. And since nobody ever got fired for building a .Net or a Java application, there are a lot of "shops" out there, that are just that ... "shops", that crank out code, and attract "the Day Coders" Peter denounces.

But in the end, Peter's strategy strikes many of us who share his ideals as one reasonable approach among others: Exploring less common, less popular technologies exposes us to passionate communities of early adopters, out to question the status-quo, driven to research better, more efficient ways to build great software.

This strategy seems to have worked well for him.

Tuesday, May 15, 2007

Introducing IBDOM

We briefly mentioned what would become IBDOM when we wrote about working with DWR.

Tonight we're pleased to release IBDOM version 0.1 under an MIT License.

The source code is on SourceForge's svn repository.

While we've yet to release the API and Usage docs, linkage to our test page from the IBDOM Site, and a cursory look at ibdom.js, should reveal plenty of useful information. Update 05/16/2007: Initial stab at Using IBDOM is up.

What is it, you might ask?

It's a 20KB (uncompressed), narrowly-scoped JavaScript library aimed at "Wielding the Document Object Model with Ease and Standards-Compliance", most notably easing the process of injecting JavaScript Data Objects, and Arrays of JS Objects, into HTML or XML documents.

IBDOM should not only be useful when working with the Direct Web Remoting framework, but also in any application where you find yourself trying to plug asynchronously-loaded data into an HTML document.

IBDOM coverage on Ajaxian.

Friday, May 11, 2007

Wikitravel: Interview with the Founders

On the heels of Wikitravel's Webby Award, founders Evan and Michele took the time to share with us the personal, human and technological adventures they continue to share with "Fifty thousand of their closest friends!".

Who Are You?

We're Evan Prodromou and Michele Ann Jenkins (Maj), the husband and wife team that founded Wikitravel.

Where are you from?

We're both originally from San Francisco, but we moved to Montreal in Canada in early 2003.

Where have you lived?

We've both lived in lots of places in the US as children -- Arizona, Texas, Hawaii, Pennsylvania, Connecticut, Ohio -- and as adults we've lived and worked in a few European cities -- Lisbon, Amsterdam, Geneva. Evan lived in Mexico City as a young teen, and Michele taught English in Nepal for a year.

How many languages do you speak?

How many do we speak, or how many do we speak _well_? Evan's got about six or seven (like Greek, Dutch, Japanese) under his belt that he can get into working order pretty quickly, but right now it's mostly French and English that he's good at. Maj has four or five, but she's also strongest in French and English.

Where have you traveled to?

There's a point in your travel career where you start counting where you _haven't_ travelled to. Seriously, though, we've both "done" most of Western and Central Europe, most of North America, Japan, Southeast Asia. Maj has traveled extensively in Nepal and India, and visited Morocco. Evan has been to Greece and Turkey as well as Australia.

Places we want to go: Buenos Aires, Sub-Saharan Africa, New Zealand, China, Siberia. Probably the dream trip still ahead of us is a circuit around the Black Sea.

How did the Wikitravel concept come to you?

We've told this story so often that it's acquired a bit of a halo of myth around it. Let's see if I can do it one more time.

We were backpacking around Thailand in the winter of 2002. On an island in the east, we arrived at a wharf and took a pickup-truck taxi called a songthaew out to where our guidebook said a great hotel would be. We had to walk the last 1/2 mile, and when we got to the hotel spot, we found an empty field -- maybe a few two-by-fours sticking in the ground, but if there had ever been a hotel here, it was long gone.

As we were hiking back to the road to try and find another guesthouse to stay at, we started talking about the experience. Sure, it was unpleasant for us that our guidebook had incorrect information. But what really made us mad was that nobody else could benefit from our experience.

We could send in an email or letter to the guidebook company, but they wouldn't send a writer out to check for another year or two. In that time, how many people would make the same mistake we made? One thousand? Ten thousand?

We could also post our experience on a forum or mailing list, but it's practically impossible to glean good information from that kind of linear medium. People ask the same questions over and over and over, and meaningful information gets lost in the noise.

Evan had had experience writing for Wikipedia the previous year, and he thought that maybe a wiki travel guide series would work. That way, those ten thousand travellers could update the guide themselves -- they wouldn't have to wait for that one writer to finally show up.

What does it take to make a webby-award-winning site?

Fifty thousand of your closest friends!

Seriously, we're very proud of the work we've done with Wikitravel, but we're also humbled by the outpouring of generosity from the Wikitravel community. We were lucky to have this idea at the time when we did; people were really ready for a project like this. Wikitravel is much bigger than we are.

If we had to think of something that we did to make Wikitravel successful, it would be that we kept our vision of the site persistently throughout its lifetime. We've changed a lot of things on Wikitravel in the last four years, but we've kept true to the core vision. It's a vision that's shared with those 50,000 people, and that's probably what's made the site so successful.

... with your spouse?

That's been fun. It's not easy to work with your spouse, but we've had a deepened connection working on such a big project. We probably wouldn't know each other as well, or in the same way, if we hadn't worked on Wikitravel together.

Probably being married we're both more and less effective as a team. We tend to have the rest of our life bleed into our Wikitravel discussions ("You didn't rotate the server logs!" "Well you forgot to take the trash out!"), but we also understand each other so well that we can speak in shorthand about the site. ("Did you...?" "Twice." "Don't forget to..." "I made a cron job for it.")

What do you believe are some of the factors behind Wikitravel's exponential growth?

There are a lot of them that are working together. One of the biggest is the growth of Wikipedia. People around the world are becoming familiar with wikis and with Free Content through their knowledge of Wikipedia. That means that when people find out about a site like Wikitravel, they're usually already aware of what we're trying to do, generally.

Another is the growth of Free Content projects like Creative Commons. Wikitravel started using Creative Commons licenses pretty early, and our site has grown as the CC project has grown. They've given us a lot of support and attention, which is really appreciated.

Finally, I think people are just getting used to the idea of getting the information they need online. More and more, people are turning away from paper books, with their long editorial cycles and their ivory tower attitude, to the faster, more up-to-date, and more egalitarian mode of the Internet. Books are really, really convenient, but the way reference books are made today is broken, and the Internet is one way to fix them.

When and how did Wikitravel become your full time occupation? What were you doing as your day job before?

Well, that depends on how you define "full-time occupation". When we started Wikitravel, Michele was in grad school at McGill University, and Evan was working on his first novel. But it didn't take long before Michele was working about half-time on Wikitravel, and Evan was putting in a full 8 hour day on the site.

When our first baby (Amita June) was on the way, we knew we had to get serious about working outside the home. We'd both done a lot of Internet work when we lived in San Francisco, as programmers or as programming managers. So Evan took a job as an IT manager here in Montreal, and Maj worked as an information architect for an immigration Web site.

That was probably the toughest time, because we were working pretty demanding jobs, and then trying to fit another full-time job -- Wikitravel -- into our free time. It wasn't easy, especially since we were expecting a baby soon.

All this time Wikitravel was growing by leaps and bounds, and getting a lot of press (Wall Street Journal, New York Times), so our out-of-pocket costs went up, and the work we had to put into the site went up too.

So when Internet Brands approached us about Wikitravel, it was like someone throwing us a life preserver. We'd had offers to commercialize Wikitravel before, but we'd never had anyone talk to us who understood the site's mission and purpose. IB got what Wikitravel was about, supported the license, had a clear and fair plan for monetizing the site, and offered to pay us to keep making the site run.

What's your technical background?

Evan's been working as a programmer since the late 80s, when he did contract programming during college and scientific programming for research groups at Berkeley. He started working in Windows programming in 1990, and ended up working for Microsoft Corporation. In 1995 he switched to Web programming and worked on a lot of great Internet projects in San Francisco. He helped develop the security protocol SAML and then left SF when he was laid off from RSA Security. Since the mid-90s he's been an Open Source enthusiast and has been a member of a number of Open Source development teams, including Debian GNU/Linux.

Michele received an undergraduate degree in computational linguistics at UC Santa Cruz and worked on a few commercial packages before joining Salon Magazine in 1999. She helped Salon switch their content management system to a Perl-based system that eventually became the Open Source project Bricolage. She left San Francisco in 2000, travelled in South Asia, then worked for the World Health Organization in Geneva, Switzerland for 3 years. She did a master's degree in information.

Why did you pick MediaWiki?

That's a good question; we often wonder that ourselves. The main reason is probably the same as for most MediaWiki sites: it's the software that Wikipedia uses. At the time we started Wikitravel, we hadn't used a lot of other wiki software, so MediaWiki was pretty much the only thing we looked at.

At the time, MediaWiki didn't even have a name; it was just "the Wikipedia software". There were occasional stable code releases (just dated, no version numbers), but we usually ran the software that was in the CVS repository, just like Wikipedia. There weren't a lot of other MediaWiki sites around at that time; probably Memory Alpha and LQWiki were the other main ones running at the time.

What MediaWiki Hacks/Extensions have you implemented?

Probably the most important extension we've added has been a tool that allows users to define RDF data inside of pages. That's allowed our users to define structured data content, like geographical relationships, inside of tags in the Wikitravel pages, without us having to re-write the database schema to support it.

We've built a couple of cool extensions that use that RDF data. For example, our Breadcrumbs extension builds a hierarchical list of geographical areas -- cities inside regions inside countries inside continents. We use the geodata in the RDF tags to make maps, using a great mapping toolkit called Mapstraction. We link loosely-related pages to each other using RDF, and we use RDF to define "Docents" -- people who take responsibility for certain guides or destinations.

Besides that, we've tried to track some of the important identity protocols on the Internet, so that Wikitravellers' identity on sites beyond ours can be integrated into ours. So, we built an extension that supports the single-signon protocol OpenID, which is now in use on a number of different MediaWiki sites. We also built an extension to support MicroID, which lets users "claim" their user pages on services like ClaimID.

What are some of the tricks you've implemented for scaling Wikitravel?

In 2004 we developed a great tool for scaling Wikitravel. We call it "Cache404". It uses a feature of Apache that can call a script if a given file isn't found. We use this to build a pretty sneaky caching system that writes out HTML output to the server's file system, so that Apache can serve the plain HTML files the next time the file is requested. It's really, really fast -- Apache is really good at serving flat files.

That's been the big thing that keeps Wikitravel running at such great efficiency. We're still running only one Web server and one database server! But we have to really stretch our resources to make that work. We use memcached, the memory-caching daemon developed by Danga Interactive, and we use eAccelerator, a great PHP accelerator. We try to squeeze as much speed out of PHP as we can... and we try to run PHP as seldom as possible.
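
For readers curious about the shape of the Cache404 idea, here's a stripped-down sketch; the paths, the renderer, and the handler name are all hypothetical, and the real extension is considerably more involved:

    # Apache: when a requested flat file doesn't exist, hand the request to PHP.
    ErrorDocument 404 /cache404.php

    <?php
    // cache404.php -- hypothetical sketch of the "write the page to disk" approach.
    $uri  = $_SERVER['REQUEST_URI'];
    $file = '/var/cache/html' . $uri . '/index.html';   // made-up cache location;
                                                         // real code must sanitize $uri
    $html = render_wiki_page($uri);                      // assumed MediaWiki renderer

    // Persist the rendered page so the next request is a flat file Apache serves alone.
    @mkdir(dirname($file), 0755, true);
    file_put_contents($file, $html);

    header('HTTP/1.1 200 OK');                           // override the 404 status
    echo $html;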

What's the role of Microformats in Wikitravel?

A lot of what Wikitravellers do is put together lists of things: lists of hotels, lists of restaurants, lists of bars, lists of museums. We realized early on that having a consistent format for all these kinds of listings would make reading and finding them much easier for people. Once you knew that the address came first, then the phone number, then the hours, etc., you'd easily be able to find what you needed in each listing.

When we first heard of Microformats, we thought: hey, we're doing this already. We were in the process of formalizing our listings formats, so we took the extra step and made the HTML output conform to the hCard format. hCard is the microformat for "contact info", so people who have a microformats-enabled browser, like the Operator extension for Firefox, can save Wikitravel listings to their contact database, to help with their travel planning.

We use a few other uF's, like "geo" for our lat/long info. We're going to be tracking the microformats effort carefully; semantic output of HTML makes Wikitravel data available to a lot of people.
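
For the curious, a minimal hCard-style listing might be marked up roughly like this (an illustrative sketch, not Wikitravel's actual output; the establishment is fictional):

    <div class="vcard">
      <span class="fn org">Hotel Example</span>,
      <span class="adr">
        <span class="street-address">12 Beach Road</span>,
        <span class="locality">Hermosa Beach</span>
      </span>,
      <span class="tel">+1 310 555 0100</span>
    </div>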

What will Wiktravel look like in 5 years? 10 years?

As people say in Southeast Asia, "Same same, but different." I think that what we'll see on the English Wikitravel site in five years will be a lot broader and deeper coverage. We've still got a lot of the world to cover -- we've got articles for about 13,500 places on our English version right now, and our best guess at an outer limit is around 100,000 articles. We also think that those articles will be full of more information than they are now.

We also hope to have more languages supported. We're at 16 languages right now, which is pretty impressive, but we still have a lot of important languages that aren't covered. I hope that five years from now we have thriving Russian, Arabic, Chinese Wikitravel versions, and dozens more. Each language that we add means more depth of information for all our language versions.

We also think that in the five year time frame, the Wikitravel Extra features will be mature and the data stored in Wikitravel Extra will provide a vital third dimension to Wikitravel proper. Wikitravel Extra is our effort to provide "opinionated" information (reviews, blogs, photos, links, social networking, trip planning) alongside the objective content in Wikitravel. Although it's in its infancy today, we think that in five years it should be a strong competitor with other personal-opinion travel sites.

Probably in five years we'll also be available on a lot more formats. We're already making plans for a Wikitravel line of printed books. Hopefully also in five years the mobile market will be stable enough, and mobile data will be cheap enough, that Wikitravel will be available on mobile devices like cell phones.

In ten years, we think that Wikitravel will be the basis of most travel information on and off the Web. That may sound presumptuous, but probably when the Open Directory was created, the idea that Yahoo!, Google, and most other Web portals would depend almost entirely on Open Directory for their listings would have sounded absurd.

What we think will stay the same will be Wikitravel's welcoming culture and its dedication to Free Content.




See Also:

Creative Commons Interview
Wikitravel FAQ

... and see what sticks.

Yesterday was a fine day.

A developer on my team prototyped a possible solution to questions and theories that had been previously raised to improve usability and increase conversion on one of our applications.

He got to run it by various stakeholders. After constructive feedback and suggestions, they concluded: "Sounds good, let's try it". The whole decision-making process took but a few minutes.

I see this scenario unfold on a daily basis across the entire engineering team, deeply engaged in working closely with Product Management and Business to promote the success of products we build.

While there are key final decision makers who "make the call" in the absence of consensus, everybody here seems open to executing on fresh opinions and innovative ideas.

Wednesday, May 2, 2007

Wikitravel Wins the Webby!



Awesome news! Wikitravel won the much deserved 2007 Webby award in the Travel category.

Monday, April 23, 2007

Coda: Awesome Looking New Dev Tool

Wow! Panic, the Mac development shop with the coolest shopping cart system ever, just announced what looks like a tremendously useful web development tool called Coda.



Lots of great features, but I especially like the built-in terminal, and how it nicely bundles all of your files, access credentials and other assets into a collection for each site. Just as a purely organizational aid, I love that concept. Anyhow, my download is almost done, I have to go play with it!

Screenshot:

Thursday, April 12, 2007

Dave Coustan on Corporate Blogging at EarthLink

Jenerous.com has a good interview with Dave Coustan, former writer for HowStuffWorks.com and full-time blogger for EarthLink for the past 1 1/2 years.

This podcast's a good insight into "Corporate Blogging" at a company that serves over 5 million paying subscribers, with ~2,000 employees.

Dave's a great guy with a rare knack for effectively conveying techie concepts to humans.

With a few hundred employees and roughly 1% of world-wide web traffic going through one of the online destinations we serve, Internet Brands isn't the smallest fish in the pond, yet it's perhaps the "Biggest Internet Company the World's Never Heard of".

In certain ways it's "strategically neat" to stay below the radar. Yet it occasionally hurts us when it comes to "attracting developers". While not exactly a "corporate blog", ibbydev aims to bring an occasional geek-friendly glimpse into the stuff we do behind the scenes.

Tuesday, April 10, 2007

Wikitravel: Webby Award Nomination!

Congratulations to Evan, the Wikitravel.org community, and the Internet Brands Travel Team, for their Webby Award Nomination!

Friday, April 6, 2007

Recruiting: The Acronym Soup Phenomenon

Joel Spolsky has on many occasions mused about the challenges of attracting and retaining great developers. In my ever so irrelevant opinion, Joel's one of those few hiring managers who "get it". He's been in the coding trenches, and has a keen understanding of how developers think, and some strong strategies for attracting new talent.

In his "Sorting Resumes" article, Joel points out:
To top programmers, the most maddening thing about recruiters is their almost morbid fascination with keywords and buzzwords.


... and further explains:
The keywords section of a resume can’t be trusted much, anyway: every working programmer knows about these computer programs that filter resumes based on keywords, so they usually have a section of their resume containing every technology they have ever touched, solely to get through the filters.


Having endured the challenges of filling various developer positions in my team, I can most vividly relate to this problem, affectionately (less so these days) calling it the "Acronym Soup Phenomenon" (ASP).

Invariably, candidates feel the urge to fill their skills section with a slew of Acronyms, claiming in one fell swoop "Expert Knowledge of" areas ranging from front-end Document Authoring and User Interface Engineering, to Database Administration, Systems and Network Engineering.

While Great Developers with amazing breadth and depth of experience are out there, Joel's articles and empirical evidence teach us that odds are they didn't send you the pile of Acronym-laden resumes you're looking at, assuming they even have an updated resume.

As an example, seeing "XML, XSLT, and XPath" in a candidate's Acronym Soup, I would look forward to discussing the potential merits or shortcomings of an XML Database with an XPath or XQuery API for storing and exposing syndicated XML content to applications ... Only to find them shying away from writing on the whiteboard a sample expression for a "foo" element child of a "bar" element.
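
For reference, the whiteboard answer being fished for is a one-liner; either form below would have done nicely:

    bar/foo      (relative to the context node)
    //bar/foo    (anywhere in the document: every "foo" whose parent is a "bar")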

AJAX is a hot acronym that shows up in everyone's soup, and one would think most "AJAX developers" coming in for a User Interface Engineer position would have by now read Jesse James Garrett's piece, or acquired a little more knowledge and/or practical experience than copying script.aculo.us code samples.

When asking candidates what they think AJAX is, 30% can indeed spell out what the acronym means, and one candidate so far was able to give a slightly interesting description of the various technologies and schools of thought that make it up, while showing cursory knowledge of various DOM methods and the XMLHttpRequest object.

Some candidates might casually mention they "slipped" AJAX in the skills section because they're interested in learning it. That's how hot AJAX is.

As a result of the Acronym Soup Phenomenon, a resume with fewer buzzwords, focusing on a specific discipline tied to the open position, is more likely to get attention.

When interviewing at any decent company, it's fair to expect candidates will be asked to demonstrate the appropriate level of expertise to their prospective colleagues.

If you pass the screening process with a rich Acronym Soup, bring patience to the interview and we'll provide water, food and a sleeping bag: we'll want to know just how thick that soup is. Not because we're sadists with nothing better to do, but rather because we run a lot of applications, both internally and customer-facing, serving a myriad of business needs on a wide variety of code bases: PHP on a LAMP stack, ASP .Net, Java/Servlet Container, Java/EJB Container, Spring+Hibernate, DWR. Breadth of experience with adequate depth won't go wasted.

And yes ... we "do" AJAX.

Wednesday, April 4, 2007

Reviewing Google Desktop for Mac

Google today just released Google Desktop for Mac.

TheAppleBlog.com has a good review of it.

QuickSilver and Spotlight users may wonder where Google Desktop fits-in, and from using all 3 apps, here are some guesses:

- Spotlight is a simple view into Mac OS X's real-time file-system-based indexing technology: Every time you save something on your Mac, Mac OS X detects this "event", and tells Spotlight to "index" or "re-index" the file in real-time. This is why it's possible to create a text file anywhere on your hard drive, type some text in it, save it ... and searching for that text in spotlight instantly reveals the file you just created. More importantly though, Spotlight will find just about any piece of information stored in just about any document on your hard drive, whether it lives in the file name, or in the file itself.

- QuickSilver leverages Spotlight to near-instantly keep track of installed Applications, but also various types of documents fitting certain categories. From here though, QuickSilver is far more about "Acting Without Thinking" than it is about merely "Finding Stuff":
In the end, Quicksilver has one very important effect: the effort of frequent tasks fades into the background and you are able to act without thinking. After an adaptation period, Quicksilver becomes an extension of yourself; the process fades away leaving only the results.
In this sense, QuickSilver and Spotlight are very complementary technologies.

- Google Desktop for Mac, however, appears to be reindexing the entire hard drive without relying so much on the "live index" kept by Spotlight. This isn't entirely surprising, as Google may be leveraging custom indexing and searching algorithms to surface more relevant results than Spotlight otherwise would.

As a good neighbor though, the Google Desktop Preferences pane indicates that Google Desktop will obey all no-indexing privacy directives specified in Spotlight, which is a nice integration touch.

It will be interesting to see whether Google Desktop will pick-up "real-time" changes to the file system the same way Spotlight does.

Similarly to its Windows counterpart, it runs a web service that appears to bind solely to localhost on port 7468, which is reassuring. The last thing you want is for someone on your network to be able to query your hard drive. Trying to connect to port 7468 on my LAN IP from another machine on the network was, as I was hoping, refused.

If it does deliver on speed and results relevance, Google Desktop might if anything be more of a "replacement" for using Spotlight for day-to-day file searches, while providing a more elegant integration framework of Mac OS X with the overall Google Ecosystem.

Thursday, March 29, 2007

Barcamp Los Angeles 3 Wrap-Up

Joe stopped-by for a few hours on Saturday and taped his business card to our sponsor logo, reminding attendees that we are, indeed, hiring :)

Sunday morning started with a lavish breakfast, and Heather giving a very interesting demo of ooVoo, with a colleague of hers travelling through Hong Kong, telling us all about his exciting Tuxedo Travels adventures.

After having spent time in the morning putting together some sample demo code, I gave the 10:30am presentation on "DWR, XHTML, JS ...". The first question I asked of the audience was whether anyone had ever heard of "Internet Brands", and much to my surprise, a few hands actually rose.

While we're best-known for CarsDirect.com, we also operate many exciting destinations such as Wikitravel, World66.com, SlowTrav.com for travelers interested in less "touristy" experiences, Loan.com which provides education, advocacy and a vibrant marketplace for consumers seeking Ethical Lenders, DoitYourself.com for home improvement and home repairs, RealEstateABC and its best-of-breed Home Valuation tool, WikiCars.org for all things automotive, autos.com for car research.

I had a Firefox window open with tabs for all these sites, and after quickly flipping thru them, I went on to our latest project: the complete redesign, rebuild, and re-architecture of CarsDirect.com's Used Cars Marketplace, and the challenges associated with developing an advanced, dynamic User Interface while keeping it maintainable.

The audience asked very pertinent questions about some of the issues and challenges we're still addressing, such as the ability to bookmark various searches as we refine results, and our current inability to get any SEO juice out of the search results page. Many expressed interest in being able to "subscribe" to vehicle searches via RSS. Doug had actually implemented this feature, but we still need to surface it and fix a few cosmetic issues. But if you're curious, here's a feed for Used Cars in Hermosa Beach, CA, with a power convertible top, and manual transmission.

We've got ... a few ideas for RSS, and possibly more advanced feeds, allowing for more exciting applications ... and not just ones built by us.

Beyond user interface, attendees realizing that each click on the results page triggered an asynchronous call were curious to know how we returned data so fast. When I joined this project, Doug and his team had already built a proof of concept they called "wireframe" that truly showcased this "instant satisfaction" we get from interacting with the interface. It felt ... impressive, leading to a major "Neo-watches-Morpheus-jump-to-the-next-building ... Wow" moment when he explained how they'd architected the app to make this 1) possible, and most importantly 2) scalable.

But I digress. The rest of the presentation covered the sample code I'd put together, showing how we can use "cloned templates with mapped data fields" to facilitate the process of injecting data contained in JavaScript Objects into a document, enabling you to keep the Markup away from the Scripting. So far we still have management's approval to release this framework as a library, but we do need to find time to re-factor a few things.

After the presentation, I got to meet, chat, and reconnect with a number of interesting people:

Jason Fields from fellow IdeaLab company Snap.com was passing around stickers and reminding us about their very cool "Snap Preview Anywhere™" feature.

Chris Gagne showed me Student of Fortune, and later on gave a talk on leveraging Open-Source concepts to solve real-world problems.

Zach Greenberger told me all about Valuewiki.com. It's a very interesting concept: They're providing a community-driven securities research site. Being a Wiki, user-contributed content can, to some extent, "self-correct" itself, resulting in much-higher quality information than what you might find on a typical "stock message board" such as Yahoo Finance. They also run "Free the Market" blog, with insightful posts covering the Wiki and Business Worlds. Their PayPerPost.com interview with Dan Rua even got slashdotted yesterday!

Belkin, a fellow sponsor, showed off a preview of their very cool "Networked USB Hub", to be priced around $120 if I remember correctly. I would have bought one right away had it been 1) available for sale and 2) shipped with a Mac OS X driver. It turns out they're a couple of months away from releasing it, and they do have a Mac OS X driver ready; they just need to get it out of the "beta" stage.

Crystal Williams demoed practical applications of Microformats to a packed audience.

Karl Roth gave us a presentation on his Solar Illumination system, to cost-effectively bring sunlight into just about any building, to complement existing electric lighting with dramatic energy savings during daylight hours. This cost-effective solution would pay for itself in under 2 years in energy savings alone, not taking into account extra revenue resulting from increased productivity from dramatically improved working conditions.