Google AppEngine Thoughts

Posted by Toby Tue, 08 Apr 2008 13:54:00 GMT

I just read a little bit about the recent announcement of Google AppEngine and I think it’s a pretty good service overall for certain kinds of Web applications. Contrary to the rumors that were floating about prior to its announcement, it’s not a direct competitor to AWS, nor even a competitor to Ning as some have also claimed. To me, it seems more directly in competition with Heroku, if such a thing could even be said. It is clearly based on Bigtable, though, so that part of the rumor appears to have been true.

The service’s simplicity has some advantages and presents some interesting constraints. Because you don’t have root and can’t run the “box” yourself, you’re forced to think simply about the app itself. This seems as if it would be quite a welcome constraint for a lot of Web developers who don’t care about running a network. But at its heart, the constraint really serves to enable Google-style scalability. The use of CGI also allows Google to run your code wherever they deem best at the moment, without you having to care about that sort of thing. This is something they already do quite well.
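To make the CGI point concrete, here is a minimal sketch of a stateless, CGI-style handler. It is not the AppEngine API; `handle_request` and the `name` parameter are hypothetical names chosen for illustration, and only the Python standard library is assumed. The idea it shows is that everything the code needs arrives with the request, so any machine could answer any request.

```python
import io
from urllib.parse import parse_qs


def handle_request(environ, out):
    """Answer one request using only what arrives in `environ`.

    Nothing is kept in memory between calls, which is what lets a host
    run each request on whichever machine happens to be convenient.
    """
    # CGI passes the query string to the program as QUERY_STRING.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]

    # A CGI response is headers, a blank line, then the body.
    out.write("Content-Type: text/plain\r\n\r\n")
    out.write("Hello, %s!\n" % name)


# Simulate a single request without a real Web server.
out = io.StringIO()
handle_request({"QUERY_STRING": "name=Toby"}, out)
response = out.getvalue()
print(response)
```

Because the handler holds no state of its own, anything that must survive between requests has to live in an external store, which fits the lack of a local RDBMS discussed next.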

The lack of a traditional RDBMS also obviously works well in a shared-nothing environment and makes a lot of sense, since Google’s infrastructure is already built that way. This gives real credence to the idea that the RDBMS isn’t the be-all and end-all, and it will introduce a lot of developers out there to this new way of thinking.

A few things I don’t like:

  • Users of an AppEngine app must log in with Google Accounts
  • You do have to upload your Python source to Google and this leads to obvious privacy and IP concerns
  • No recurring job scheduling (essential for lots of different kinds of apps)

One more obvious thing is that an application built on AppEngine is one that is much, much easier for Google to acquire than one that is not. Don’t underestimate that piece of it, as it’s likely this service will be a loss-leader for Google.

Finally, I think this announcement will be very good for Python. As AppEngine only supports Python right now and for the foreseeable future, anyone interested in AppEngine will have to learn some Python. This will shed some more light on Python for people who might not have otherwise given it a try.

UPDATE: Apparently, others agree about the acquisition potential of apps on AppEngine. Good stuff.

Philly Emerging Tech 2008 Wrapup

Posted by Toby Fri, 28 Mar 2008 01:07:00 GMT

Wow, this year’s ETE was even better than last year’s. A bigger crowd, a better venue and great speakers made this year the best yet. I can’t wait for next year’s! There’s a lot of cool stuff going on around the Philly area that you’d never know about if it weren’t for events like this. E.g. did you know MapQuest is located in Lancaster, PA? I had no idea until yesterday.

In other conference news, my talk yesterday went pretty well. People seemed to really enjoy it and the room was full. You can view my talk slides on Hadoop on my talks page along with my other slides from previous talks (including last year’s ETE talk on Comet).

Thanks to Chariot for putting on another excellent event, in particular Tracey Welson-Rossman and her merry band of awesome facilitators! Keep it going for next year :)

I'm Speaking at Philly ETech 2008

Posted by Toby Wed, 26 Mar 2008 11:55:00 GMT

I’m speaking about Hadoop today at 4PM EDT at the Philadelphia Emerging Technologies for the Enterprise conference. It’s located at Drexel University. Come on down if you can make it; the speaker list is great and I’m sure it will be a great time for all!

Yahoo! Moved to Hadoop for Production Search

Posted by Toby Tue, 19 Feb 2008 20:57:00 GMT

Today a real validation for the Hadoop effort came in the form of Yahoo! announcing that their production search is now running on Hadoop. This should go a long way to allaying others’ concerns about Hadoop’s speed, stability or scalability. Cheers, Yahoo!: You’ve done some excellent work!

The Rock and Why It’s Bad for Your Mind

Posted by Toby Wed, 22 Aug 2007 17:49:00 GMT

It would seem that Sun’s new chip, dubbed the Rock, will support hardware extensions for software transactional memory (STM).

I tend to think of this as not such great news.

Now, I’m all for helping programmers in their work (being one myself) and making it easier to work with the parallel systems coming down the pike. However, I don’t see hardware STM support as a big step forward.

I do realize that people don’t change overnight and that all of that legacy code can’t be rewritten in a flash to take advantage of the newer, better ways always being introduced. I also realize that Sun is making the right business move in allowing their customers to “port” their code to these more parallel machines with the minimum of effort. That’s all well and good and I applaud them for being the first to market on this, both from a business and a technical perspective. I’m sure what they did wasn’t an easy task. But making a big deal of this advance obscures the ultimate truths involved here.

It’s no secret that I’m a fan of message-passing concurrency (MPC) in general and Erlang in particular. You are free to disagree, but I believe that MPC is the safer and cleaner way forward for the industry. It removes a lot of the issues that are still problems with STM, where they are just hidden so that (at least initially) only the VM and compiler guys have to worry about them. Not that it doesn’t introduce some issues of its own, but I find these to be preferable to the alternatives.

However, in the end, using STM doesn’t force programmers to write concurrent code. They are not thinking concurrently, only thinking about what to mark as atomic. This is still sequential programming. People using SQL have been thinking this way for years and I’d hardly call it concurrency-oriented. Raise your hand if you’ve worked on a project where someone kills database performance with some poorly-written transactions. All hands up?

We’re just going to have to go through this again later when the STM abstraction starts falling apart. Optimistic concurrency (the underlying premise of STMs) is only as good as the care taken to avoid having too many transactions, or transactions that run too long. Without this care, performance will suffer both in speed and concurrency. It is, after all, optimistic. Optimism just doesn’t seem to scale.
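The retry cost behind that optimism can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any real STM (or the Rock’s hardware support) is implemented: `VersionedCell` and `atomically` are hypothetical names, and the conflict is staged by hand so the outcome is repeatable.

```python
import threading


class VersionedCell:
    """A value plus a version number; a commit succeeds only if the
    version is unchanged since the value was read."""

    def __init__(self, value):
        self._lock = threading.Lock()  # guards only the brief read/commit steps
        self._value = value
        self._version = 0

    def read(self):
        with self._lock:
            return self._value, self._version

    def try_commit(self, new_value, expected_version):
        with self._lock:
            if self._version != expected_version:
                return False  # someone else committed first
            self._value = new_value
            self._version += 1
            return True


def atomically(cell, update):
    """Apply update(value) optimistically, retrying on conflict.
    Returns the number of attempts it took."""
    attempts = 0
    while True:
        attempts += 1
        value, version = cell.read()
        new_value = update(value)  # work done with no lock held
        if cell.try_commit(new_value, version):
            return attempts


cell = VersionedCell(0)
interfered = []


def slow_increment(value):
    # On the first attempt only, a competing writer commits mid-transaction.
    if not interfered:
        interfered.append(True)
        atomically(cell, lambda v: v + 10)
    return value + 1


attempts = atomically(cell, slow_increment)
final_value, final_version = cell.read()
print(attempts, final_value, final_version)
```

The first attempt’s work is thrown away because another writer got in first. The longer a transaction runs and the more writers there are, the more often that happens, and nothing in the calling code makes the cost visible.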

You can argue that the best programmers will do this beautifully, and I agree. But how many of us get to work with the best programmers? How many more of us have to deal with a bunch of code written by people who couldn’t wait to jump in their cars at five o’clock? To me, message-passing concurrency solves this problem in a more direct (if initially more brutal) way: it just plain forces the programmer to think concurrently. That’s the real key. Don’t mask it, make them face it.
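For contrast, here is a rough sketch of the message-passing style using Python threads and queues. It is only an approximation of the idea (Erlang processes are far lighter and share nothing at all), and the names `counter_actor` and `run_demo` are hypothetical. The point is that the count belongs to exactly one thread, and everyone else has to send it a message.

```python
import queue
import threading


def counter_actor(mailbox, replies):
    """Owns the count exclusively; the only way to touch it is a message."""
    count = 0
    while True:
        message = mailbox.get()
        if message == "increment":
            count += 1
        elif message == "get":
            replies.put(count)
        elif message == "stop":
            break


def run_demo(senders=4, per_sender=1000):
    mailbox = queue.Queue()
    replies = queue.Queue()
    actor = threading.Thread(target=counter_actor, args=(mailbox, replies))
    actor.start()

    def sender():
        for _ in range(per_sender):
            mailbox.put("increment")

    threads = [threading.Thread(target=sender) for _ in range(senders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Every increment is already queued ahead of this request,
    # so the reply reflects all of them.
    mailbox.put("get")
    total = replies.get()
    mailbox.put("stop")
    actor.join()
    return total


total = run_demo()
print(total)
```

There is nothing to mark as atomic here and nothing to retry. The price is that the programmer has to design the messages and decide who owns what up front, which is exactly the kind of thinking being argued for.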

STMs are essentially a stop-gap measure to stem the pain coming from the rise of multicore. But, like any drug, the effects will wear off and that pain will be felt. By using STMs, you’re borrowing from the future. Personally, I would think it better to just rip off the band-aid now and start learning and using the message-passing concurrency mindset today.