Powerset launches itself directly into the toilet
Stop me if you’ve heard this one: a “semi-stealth” startup spends 2.5 years, hires ~200 people and gives them all new MacBook Pros, takes untold millions in multiple rounds of VC funding, makes a bunch of noise about being the next generation of search and then launches with a search engine that basically re-indexes Wikipedia and not much else. ROFLOL, right?
Not if you’re Powerset it’s not. This is exactly what they’ve done.
As far as I can tell, their burn rate can’t be any less than ~$21MM/year. This is counting 200 people at an average $100K/year salary and 1,000 small EC2 instances for a year. I’m pretty sure they are using more EC2 instances and lots of their people are making more than $100K/yr in downtown San Francisco, but I’m being conservative here. This doesn’t seem that bad until you realize that their revenue is $0/forever thus far. Conservatively, they’ve probably burned $40MM already building the product as it currently stands. Geez, that’s a lot. Google got something useful up and running for $100K.
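The arithmetic behind that ~$21MM figure can be sketched as follows. The headcount, salary and instance count come from the paragraph above; the $0.10 per instance-hour price for a small EC2 instance is an assumption added here for illustration, not something stated in the post.

```python
# Back-of-the-envelope burn rate from the figures in the paragraph above.
HEADCOUNT = 200
AVG_SALARY_USD = 100_000
EC2_INSTANCES = 1_000
# Assumption (not from the post): 10 cents per small-instance hour.
EC2_HOURLY_RATE_CENTS = 10
HOURS_PER_YEAR = 24 * 365


def annual_payroll_usd():
    """Salaries only; benefits, rent and hardware are ignored."""
    return HEADCOUNT * AVG_SALARY_USD


def annual_compute_usd():
    """Every instance running around the clock for a full year."""
    return EC2_INSTANCES * EC2_HOURLY_RATE_CENTS * HOURS_PER_YEAR // 100


def annual_burn_usd():
    return annual_payroll_usd() + annual_compute_usd()


if __name__ == "__main__":
    print(f"payroll: ${annual_payroll_usd():,}")
    print(f"compute: ${annual_compute_usd():,}")
    print(f"total:   ${annual_burn_usd():,}")
```

Under those assumptions payroll dominates: compute is under a million dollars a year, so even doubling the instance count barely moves the total.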
And where is that product? Discussing it on Twitter today, my boy Cliff Moon (a Powerset engineer), sent me a link to show how good the results were. Here is that link:
http://skitch.com/moonpolysoft/mjx5/who-shot-the-man-who-shot-jfk-powerset
That’s a pretty awesome result, I have to admit. Direct and to the point and right with the info you’d need to win that bar bet. However, I replied back with this link, pertinent to some work I’m doing currently (and something that Wikipedia definitely has an article on):
http://www.powerset.com/explore/pset?q=mondrian+database&x=0&y=0
Not so much.
Now, this little exchange proves nothing. The real thing that struck me was that every result I searched for was basically a re-ambiguated list of Wikipedia results. Powerset claims to be using Wikipedia and Freebase as its base data for now so that makes some sense. However, I took a look at Freebase and it appears that most of Freebase is Wikipedia data, too! Thus, one could (semi)facetiously claim that the powerset of Powerset is {Wikipedia}. This is not great for them. I don’t really switch search engines for this little incentive. Why would I want to use Powerset when Wikipedia already has its own search engine? If that fails I can always hit Google and pare down the results to wikipedia.org, which I’m sure is what a lot of power-Wikipedia users already do.
The larger question is whether or not people would switch to Powerset. If it were an order of magnitude better, yeah, I think they would. However, the Powerset I’m seeing is nowhere near as robust or helpful as Google is today and those guys aren’t exactly standing still over there in Mountain View.
As well, I would question whether or not the question-based interface is as useful for everyday searching as the keyword-based interface of the current crop of search engines. Perhaps I’m just accustomed to that method by now, but the question-based interface seems clunkier and is definitely slower and less flexible than the keyword-based interface. There are just more ways to query an engine based on a set of keywords than there are if you have to formulate a question to do the same job. I can see how a question-based interface would be superior in certain cases, but in general? Doesn’t seem so to me.
The question-based interface brings problems on the advertiser side, too. Today’s search engines allow advertisers to buy keywords in conjunction with the ads they want to display when said keywords are queried. How is Powerset to build a CPC engine on top of a question-based engine? Are advertisers expected to guess at the questions Powerset’s searchers are likely to ask? That seems to me to be an order of magnitude more difficult than today’s keyword-guessing-game, which is already hard enough for the advertisers as it is.
Most of the “questions” I ask of search engines don’t have one-paragraph answers. When I’m researching, I want to quickly skim a bunch of sites that relate to my general query topic and then get down deeper and deeper as I learn more. Only once I’ve done that might I have some “questions” that I could properly formulate for consumption by Powerset. Am I to assume that Powerset would have me use Google or MSN for the first 80% or more of my researching? I hope that’s not their goal or their VCs are taking an acid bath right now.
I can’t remember being this underwhelmed by such an overhyped product before. Powerset really let me down. If this is the next generation of search, I’m sticking with the current generation, thank you very much. As the eminent Internet sage Ted Dziuba would say: FAIL. Google’s probably throwing a victory party in the volleyball courts right now. I really hope that Powerset gets its act together and makes me look like an idiot for posting this, but after today, it’s looking to me like that will be a very tough proposition, indeed.
37signals and Divergent Reality
I was just reading about In/Out, 37signals’ internal Twitter clone, over on their blog and it struck me how they sometimes mutilate context in their writings. First of all, DHH knows the Twitter guys well. Second of all, they didn’t even mention that it was inspired by Twitter and even went so far as to refute this in the comments. I guess a place that’s known for its “creativity” has to keep up the appearances of being innovative…
UPDATE: Apparently, as the comments state, I am wrong about this. My bad. Never mind this post.
Is AppEngine Python's Rails?
I was thinking about AppEngine some more today and it occurred to me that not only could AppEngine be responsible for a lot of people learning/using Python, but it very well might be Python’s answer to Rails. Its so-called “killer app”, if you will.
Up until this point, Python has been plagued with multiple, competing Web frameworks all taking some mindshare and there really hasn’t been a strong rallying point in the Python Web community like Rails. It appears that Django has been winning out in the blogosphere lately, but it’s nothing like Rails’ devout following in Ruby-land.
However, with AppEngine, Google does Rails one better: instead of just making it easy to code your app, they make it just as easy as Rails to code, dirt simple to deploy, and reduce the operational maintenance to near zero. The need for things like ActiveRecord and migrations is greatly reduced in the AppEngine environment, as is the need for all but a tiny knowledge of SQL (called GQL in that realm). That’s really, really attractive if you’re a Web consultancy shop that’s looking to turn over clients as fast as possible or a side project with a mandate for speed, quality and low cost. To me, that seems like it would be worth learning a little Python for.
AppEngine is pretty clearly aimed at Facebook’s F8 platform but it could end up hitting Python with a major boost in popularity as an aside. I bet Guido is smiling all the way to the bank on this one…
Google AppEngine Thoughts
I just read a little bit about the recent announcement of Google AppEngine and I think it’s a pretty good service overall for certain kinds of Web applications. Contrary to the rumors that were floating about prior to its announcement, it’s not a competitor to AWS directly, nor even a competitor to Ning as some have also claimed. To me, it seems more directly in competition with Heroku if such a thing could even be said. It is clearly based on Bigtable, though, so that part of the rumor appears to have been true.
Being so simple has some advantages and presents some interesting constraints. Because you don’t have root and can’t run the “box” yourself, you’re forced to think simply about the app itself. This seems as if it would be quite a welcome constraint for a lot of Web developers who don’t care about running a network. But the constraint really serves to enable Google-style scalability at its heart. As well, the use of CGI allows Google to run your code wherever they deem it best at the moment underneath, without you having to care about that sort of thing. This is something they already do quite well.
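To make the CGI point concrete, here is a minimal sketch of a generic CGI-style handler in Python. It is plain CGI, not AppEngine’s actual API, and the `handle_request` name is made up for illustration. The idea it shows is that everything the code needs arrives in the request environment and everything it produces goes to standard output, so the host is free to start the process on any machine.

```python
import os
import sys


def handle_request(environ):
    """Build a complete CGI response (headers, blank line, body).

    The function keeps no state between calls: its only input is the
    environment mapping the server hands over for this one request.
    """
    method = environ.get("REQUEST_METHOD", "GET")
    path = environ.get("PATH_INFO", "/")
    body = f"{method} {path}\n"
    return "Content-Type: text/plain\r\n\r\n" + body


if __name__ == "__main__":
    # A CGI server sets these variables and reads the response from stdout.
    sys.stdout.write(handle_request(os.environ))
```

Because nothing survives from one request to the next, any state the app cares about has to live in the datastore rather than in the process.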
As well, the lack of a traditional RDBMS is something that obviously works well in a shared-nothing environment and makes a lot of sense since Google’s infrastructure is already based on such. This gives real credence to the idea that the RDBMS isn’t the be-all-end-all and will introduce a lot of developers out there to this new way of thinking.
A few things I don’t like:
- Users of an AppEngine app must login with Google Accounts
- You do have to upload your Python source to Google and this leads to obvious privacy and IP concerns
- No recurring job scheduling (essential for lots of different kinds of apps)
One more obvious thing is that an application built on AppEngine is one that is much, much easier for Google to acquire than one that is not. Don’t underestimate that piece of it, as it’s likely this service will be a loss-leader for Google.
Finally, I think this announcement will be very good for Python. As AppEngine only supports Python right now and for the foreseeable future, anyone interested in AppEngine will have to learn some Python. This will shed some more light on Python for people who might not have otherwise given it a try.
UPDATE: Apparently, others agree about the acquisition potential of apps on AppEngine. Good stuff.
Twitter and me
I’ve been using Twitter for some time now and I post to Twitter orders of magnitude more than I post to this blog. With my blog, I don’t feel like I should post something unless I have something interesting or funny to say that others would like to spend some time reading. However, Twitter makes it easy to just post whatever I’m doing or thinking at any random time, reply to others’ conversations, keep up on what’s going on with friends, etc.
I really like the Twitter model, too, in that there are some very interesting constraints:
- messages are 140 characters or less
- Twitter auto-shortens links with tinyurl
- no embedded audio/video (I’m looking at you, Pownce)
- very simple network model (follow and be followed)
Twitter’s been the whipping post of the Internet for the past year because they had some well-known scaling issues and this was incorrectly blamed on their underlying Web framework, Ruby on Rails. Twitter’s got some things going for it as far as an interesting example of scaling, though, in that the model is so simple. I’ve been playing around in my mind with designing a Twitter clone in Erlang or Stackless with no RDBMS just as a mental exercise. More on that if it ever becomes concrete-er-ish.
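As a rough illustration of how small that model is, here is an in-memory sketch in Python (not Erlang or Stackless, and not Twitter’s actual design). The `TinyTwitter` class and its method names are hypothetical. It covers only the 140-character limit and the follow/be-followed graph from the list above, and leaves out link shortening.

```python
from collections import defaultdict

MAX_LENGTH = 140  # the message length limit from the list above


class TinyTwitter:
    """A toy follow-and-post model with no RDBMS behind it."""

    def __init__(self):
        self.following = defaultdict(set)  # user -> set of users they follow
        self.posts = []                    # append-only log of (seq, user, text)

    def follow(self, follower, followee):
        self.following[follower].add(followee)

    def post(self, user, text):
        if len(text) > MAX_LENGTH:
            raise ValueError(f"message is longer than {MAX_LENGTH} characters")
        self.posts.append((len(self.posts), user, text))

    def timeline(self, user, limit=20):
        """Newest-first posts by the user and everyone they follow."""
        sources = self.following[user] | {user}
        recent = [p for p in reversed(self.posts) if p[1] in sources]
        return recent[:limit]


if __name__ == "__main__":
    t = TinyTwitter()
    t.follow("alice", "bob")
    t.post("bob", "hello from bob")
    t.post("alice", "hi bob")
    for seq, user, text in t.timeline("alice"):
        print(seq, user, text)
```

Scanning the whole log on every read is the naive part. A real clone would have to choose between fanning posts out to followers at write time and merging per-user logs at read time, which is where the scaling questions come in.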
In any case, my blog isn’t going to die for a while but if you really want to keep up with every little thing with me you should follow me on Twitter ;-)
Running on Thin
After moving all my sites to Slicehost, I figured I could also now experiment with the backend of this blog, too. So, instead of a Mongrel-backed Rails app, I am backing Typo up with the Thin webserver. Thin is supposed to be faster and more concurrent than Mongrel in any case, even the evented Mongrel from Swiftcore. Here’s hoping things go well!
Yahoo! Moved to Hadoop for Production Search
Today a real validation for the Hadoop effort came in the form of Yahoo! announcing that their production search is now running on Hadoop. This should go a long way to allaying others’ concerns about Hadoop’s speed, stability or scalability. Cheers, Yahoo!: You’ve done some excellent work!
Slicehost, FTW
A couple hours ago I moved this blog, my main website and the PhillyLambda homepage over to the Slicehost service. I’ve been a Dreamhost customer since 2002 and back then they were great and cheap. They were great for a while after that, and then they were OK but now they are pretty bad. Routinely, when I SSH into Dreamhost, I see load averages in the double digits (one time it was 135+). Every time I post to this blog, I have to hit Publish and then hit back about 5 times to get it to really commit because the FastCGI keeps snapping between Apache and Rails and throwing up a 500 error. I was getting pretty tired of it all.
So, for just a little bit more money I get my own dedicated VPS where I can run whatever I want. I’d heard really good things about Slicehost for a while so I decided that when I got the time, I’d move my sites over there. I also am using DNS services from Nettica as they were very good to me while I was at Commerce360. Here’s hoping that when I press Publish this time it just does it! ;)
Definitive Proof of the Blub Paradox...
...has arrived in the form of a blog post by one Lawrence Kesteloot. In this post, he attempts to show that languages are distinguished as either “production” or “toy” languages (troll terms if I’ve ever heard them) and then proceeds to slam SICP and Lisp and generally anything he doesn’t understand.
My biggest problem with this article is not the fact that someone with a master’s degree in computer science doesn’t know what a fixed point is, but rather the fact that it nearly proves the Blub paradox to be true all by itself. Clearly, this author has become “institutionalized” in the Java mindset and confuses brevity of expression with inscrutability. He maligns powerful language features and elegance in favor of “understandability” by the average programmer and then implicitly marks himself as such by mistaking boilerplate code that an IDE generates for him for readable code. Ever read the code generated by Axis? No… why would you? That’s the least readable Java can be, in my estimation, but the point is that I don’t have to read it. Lisp excels at that kind of code generation task. The author seems to believe that thinking about the code you’re working with is a bad thing and that if it takes any effort at all to understand, it must be bad. This thing we’re in is an art, people, not a science.
The speculation about Yahoo!’s rewrite of Viaweb also does not help his case. Yahoo! trucks in average programmers galore. Surely, there are many good and great ones there, too, just not in as great a number. As such, it’s not unbelievable that they could not find Lisp programmers: you typically have to be looking for something before you find it. Also, why would a Lisp guy want to work at Yahoo!? The coolest project they have going right now is written in Java. The kind of guy who hacks Lisp is unlikely to be pining for a job at Microsoft or IBM or even Yahoo! these days, n’est-ce pas? I do know some excellent “production” people (around Philly, even) that hack on Lisp and Scheme in production, but they appear to be the exception rather than the rule. Finally, I’m assuming that the author has never heard that lots of the big financials choose a very high-level functional language when they need to make sure their systems are right. I bet this is partially so they can attract the very best people to those positions, as well.
I suppose the author might have also taken Amazon’s choice to dump Lisp as a sign that Lisp is not production (assuming he was aware that it was, in fact, originally built in Lisp and C). I, instead, would take that as an indication that Amazon outgrew the ability to employ only the best people they could find. Business growth has a way of doing that to you. The distributions of talent and passion appear to be two somewhat uncorrelated power law curves and finding people at the top of both curves simultaneously is a task that will make anyone’s hair fall out and liver get harder. Plus, there are many business realities that make Blub languages more attractive for any number of reasons: talent pool, employee turnover, ramp-up time mitigation, cultural integration, etc. I write this article after having just dumped Ruby on Rails in favor of Java at my startup for some of these very reasons.
To this author I would only say that a language does not make something “production”. People do. Throw Alan Cox, Ingo Molnar and Guy Steele in a room and I bet they could write a better mousetrap in Brainf*ck if they were excited enough about it. It’s never about the language. It’s about the people. Full stop.
"Sometimes you need more than one programming language"
Someone had said the above statement to me today and it stuck in my mind for some reason. I was thinking about it tonight and something occurred to me.
I’ve heard a bunch of variations on this statement before today as I’m sure we all have. Often, when I hear this statement it’s coming from people who would like to be able (or allowed) to use Ruby in their work projects. These places are invariably ones in which Java is the primary language and any others are either frowned upon or outright barred. The management at these places, either technical or not, typically characterizes this decision as smart given the large market for engineers that know Java, the availability of quality libraries for many purposes and the inherent generality of the language. All of these are certainly true and are pretty good business reasons to use Java.
These arguments ignore the ease of use of Ruby and the speed of development one can achieve with it. They are also a form of premature optimization, although in this case of business interests, not source code. Most of these shops are doing 3-tier Web applications and, as such, they are missing out if they skip over Rails because it doesn’t run on Java. (It does, but whatever.)
However, it was the statement that really hit. People saying that are indicating that different languages have different strengths. Even though Java now has regular expression capabilities, you’d still want to use Perl or Ruby for a big text processing job given how many built-in facilities those languages have for that task. You’d want to use Erlang for very-long-lived server applications over Java because it was tailor-made for that purpose and Java was not. All of these points lend themselves towards a heterogeneous language environment, using a language for what it’s good for and not for what it’s not.
I wonder, though: when people tell me this, they are usually rationalizing the use of Ruby with this statement. Not “rationalizing” in that they shouldn’t be using Ruby, but rather they are seeking to get leverage for Ruby so that it might one day be in the position that Java is now. I can certainly sympathize with this; Ruby is a sweet language. Alas, you can’t get there from here today, though. The main Ruby interpreter has major problems with memory and stability and its successors are still in their nascent stages. The runtime situation is just not that good so you can’t drop Java for everything yet if you need speed/low-memory in places.
How many of the people who have said this, though, would then reverse their position and advocate for Ruby being the only language for “maintainability” or “ease of introduction to the junior guys” reasons once a strong Ruby VM was available? And how many people said the same thing in reference to Java when C and C++ were king? It seems to me that these language trends are cyclical. The new hotness challenges the old-and-busted and eventually wins, only to become the new old-and-busted.
The irony here is that the statement is true in an absolute sense. One should use languages for the things they are good for and find different ones for things they are not. To not do so is to arbitrarily shorten not only your toolset but your very range of thought. (Big up, Sapir-Whorf.) Attempting to use one language for all your programming needs leads to ridiculous situations like the Kingdom of Nouns phenomenon. The other strangeness with this statement is that the people who say it generally never mention the really out-there languages like Lisp, OCaml, Prolog or Smalltalk that are orders of magnitude better at certain things than more mainstream languages. Personally, I just hope people remember the irony when 10 years have passed and all new development at JPMorgan is in Ruby running on a GemStone-derived VM.