"Sometimes you need more than one programming language"
Someone said the above statement to me today and it stuck in my mind for some reason. I was thinking about it tonight and something occurred to me.
I’ve heard a bunch of variations on this statement before today, as I’m sure we all have. Often, when I hear this statement, it’s coming from people who would like to be able (or allowed) to use Ruby in their work projects. These places are invariably ones in which Java is the primary language and any others are either frowned upon or outright barred. The management at these places, whether technical or not, typically characterizes this decision as smart given the large market for engineers who know Java, the availability of quality libraries for many purposes and the inherent generality of the language. All of these are certainly true and are pretty good business reasons to use Java.
These arguments ignore the ease of use of Ruby and the speed of development one can achieve with it. They are also a form of premature optimization, although in this case of business interests, not source code. Most of these shops are doing 3-tier Web applications and, as such, they are missing out if they skip over Rails because it doesn’t run on Java. (It does, but whatever.)
However, it was the statement itself that really hit me. People saying it are indicating that different languages have different strengths. Even though Java now has regular expression capabilities, you’d still want to use Perl or Ruby for a big text processing job given how many built-in facilities those languages have for that task. You’d want to use Erlang over Java for very-long-lived server applications because it was tailor-made for that purpose and Java was not. All of these points lend themselves to a heterogeneous language environment: using a language for what it’s good for and not for what it’s not.
I wonder, though: when people tell me this, they are usually rationalizing the use of Ruby with this statement. Not “rationalizing” in that they shouldn’t be using Ruby, but rather they are seeking to get leverage for Ruby so that it might one day be in the position that Java is now. I can certainly sympathize with this; Ruby is a sweet language. Alas, you can’t get there from here today, though. The main Ruby interpreter has major problems with memory and stability and its successors are still in their nascent stages. The runtime situation is just not that good so you can’t drop Java for everything yet if you need speed/low-memory in places.
How many of the people who have said this, though, would then reverse their position and advocate for Ruby being the only language for “maintainability” or “ease of introduction to the junior guys” reasons once a strong Ruby VM was available? And how many people said the same thing in reference to Java when C and C++ were king? It seems to me that these language trends are cyclical. The new hotness challenges the old-and-busted and eventually wins, only to become the new old-and-busted.
The irony here is that the statement is true in an absolute sense. One should use languages for the things they are good for and find different ones for things they are not. To not do so is to arbitrarily shorten not only your toolset but your very range of thought. (big up, Sapir-Whorf) Attempting to use one language for all your programming needs leads to ridiculous situations like the Kingdom of Nouns phenomenon. The other strangeness with this statement is that the people who say it generally never mention the really out-there languages like Lisp, OCaml, Prolog or Smalltalk that are orders of magnitude better at certain things than more mainstream languages. Personally, I just hope people remember the irony when 10 years have passed and all new development at JPMorgan is in Ruby running on a GemStone-derived VM.
Just Some Idle Anti-Spam Thoughts
Earlier this year, I came off a five-year stint in the anti-spam industry. As such, I try to keep up on the latest in that world. Tonight, I was talking with an acquaintance of mine and the topic of spam came up. I was telling him about the Storm Worm and the relatively new wave of stock pump-and-dump spams with professional-looking PDF attachments as payload. He’s an options trader at SIG and he had a different take on that kind of spam: he told me he had looked at getting into that market! Not spam, mind you, but rather shorting the stocks that are featured in these pump-and-dump spams. Obviously, this is how the originating spammers themselves make money but I never figured a legitimate outfit would get involved with stuff like that (they decided not to, he said). Now, I don’t know if this is true, but he also told me that there are entire hedge funds that just watch the mail streams and look for these pump-and-dumps to make short moves on. I found this fascinating for some reason.
This was on my mind as I was driving home and I wandered a bit in my thinking. I started to think about the work I had done and how I really enjoyed the anti-spam challenge. It seems to me that the fundamental challenges in anti-spam are twofold:
- Your opponents are highly motivated because the profit margins are so high (spam is very cheap to create/send)
- Commercial filters are bound to increase their effectiveness even in the face of ever-higher volumes of spam every year
I started thinking harder about that last part and I was thinking, “hmmm… what else has a super low signal:noise ratio?” Then I thought of astronomical observation data. I was talking with someone at Amazon about that very thing and he was telling me that astronomical observation data tends to be huge (~2GB per second, raw) and that it contains almost no signal. I wonder what techniques they are using to sift out the meat there and whether or not any of them are applicable to spam filtration. Has anybody looked into the parallels between these two before?
Finally, I got to thinking about some of the techniques that are in use today and I hit upon an interesting thought I’d never come across before. Distributed reputation services are pretty much the de facto standard these days for all commercial vendors. Every vendor has a different pretentious name for these things but, basically, they constantly update centralized databases of sender reputation in near real-time based on the information about emails flowing into their edge systems. These edge systems can be desktop spam filters (e.g. Cloudmark) or big, honkin’ border MTAs like the SMS 8300 or something in between.
I was thinking, “it’s easy to do that on one MTA.” Just have the MTA keep a database of reputation information and update it incrementally as new mails flow in and the filter renders a verdict. Hell, if you log your filter verdicts, you could just run through that log every hour or so if you were willing to settle for something not quite as good.
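To make the single-MTA idea concrete, here is a minimal Python sketch of an on-box reputation store. Everything in it is an assumption for illustration: the `LocalReputation` name, the choice of keying on sender IP, and the smoothed spam-fraction score are mine, not how any particular vendor does it.

```python
from collections import defaultdict


class LocalReputation:
    """Per-sender verdict counts kept on a single MTA (hypothetical sketch)."""

    def __init__(self, prior_spam=1.0, prior_ham=1.0):
        # Priors smooth the score for senders with little history (assumed values).
        self.prior_spam = prior_spam
        self.prior_ham = prior_ham
        self.counts = defaultdict(lambda: [0, 0])  # sender_ip -> [spam, ham]

    def record(self, sender_ip, is_spam):
        """Incremental update: call once per verdict the filter renders."""
        self.counts[sender_ip][0 if is_spam else 1] += 1

    def spam_score(self, sender_ip):
        """Smoothed fraction of this sender's mail that was judged spam."""
        spam, ham = self.counts.get(sender_ip, (0, 0))
        total = spam + ham + self.prior_spam + self.prior_ham
        return (spam + self.prior_spam) / total

    def replay(self, verdict_log):
        """Batch alternative: run through logged (sender_ip, is_spam) pairs."""
        for sender_ip, is_spam in verdict_log:
            self.record(sender_ip, is_spam)
```

The `record` method is the incremental path and `replay` is the run-through-the-log-every-hour path; with these default priors, a sender nobody has seen before scores a neutral 0.5.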
However, then I thought, “hmmm… I wonder what the marginal value of a distributed reputation service is?” Meaning: what’s the difference in value between a purely on-box reputation database versus one that takes feeds (albeit updated more slowly) from a network of border MTAs? Given that all of these services (with the exception of Vipul’s Razor) are commercial, I would have expected to see a showdown in some magazine by now. I guess since they are just pieces of a larger product they are not usually featured on their own (except SenderBase, which IronPort can’t shut up about).
Still, I think that’d be an interesting analysis: really profile the ROI and particulars of a distributed reputation service like that versus purely local reputation information. One could find the theoretically optimal number of nodes in said network and find out the scaling model of a service like that along a bunch of different metrics, etc. You could even throw some agent-based modeling at it and simulate Internet conditions to see more pros/cons of each. Has anyone else ever heard of an analysis like this?
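As a toy version of that comparison, here is a rough Python sketch that counts how often a spamming address is already known when its mail arrives, either to the receiving MTA alone or to any MTA in the network. The `simulate` function, its parameters, the uniform traffic model and the instant sharing of sightings are all assumptions of mine; a real feed would lag, so the shared number here is an upper bound rather than a prediction.

```python
import random


def simulate(num_mtas=5, num_spammers=200, mails_per_mta=300, seed=0):
    """Return (local_hit_rate, shared_hit_rate) for a toy network.

    A "hit" means the sender was already known as a spammer when the
    message arrived: known to the receiving MTA (local) or to any MTA
    in the network (shared).
    """
    rng = random.Random(seed)
    local_seen = [set() for _ in range(num_mtas)]
    shared_seen = set()
    local_hits = shared_hits = total = 0

    for _ in range(mails_per_mta):
        for mta in range(num_mtas):
            # Uniform traffic across spammers: a simplifying assumption.
            sender = rng.randrange(num_spammers)
            total += 1
            local_hits += sender in local_seen[mta]
            shared_hits += sender in shared_seen
            local_seen[mta].add(sender)
            shared_seen.add(sender)

    return local_hits / total, shared_hits / total
```

The gap between the two rates is one crude stand-in for the marginal value of the network. Sweeping `num_mtas` gives a first look at how that gap scales with node count, and with a single MTA the two rates are identical by construction.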
Then, I was thinking, “dood its late…”
Closing Notes on the Amazon Startup Challenge
I just got back from the Amazon Startup Challenge event and I had a really great time. Tracy Laxdal, Alicia Nakamoto and the rest of the team did a superb job of organizing and running the event and everyone was engaging and interesting. I even saw Jeff Bezos jump into a car in front of me and dash off ;)
It was a pretty hectic day, with our late flight and subsequent (relatively) early pitch to the judges. Werner Vogels came down and gave us a very intriguing talk on Amazon’s technology focus and some of their internal processes. Then, we did a Q&A round with the product managers of Amazon S3, SQS and EC2 where we got a chance to grill them regarding upcoming products, features, pricing, etc.
After some downtime back at the hotel, the “lightning round” started up. Here, we pitched to 8 VC firms in less than 2 hours. Whew! That was really intense but very fun. On to cocktail hour, where a bunch of outside guests arrived prior to dinner. I finally got to meet the RightScale guys in person after having seen Thorsten on the AWS forums for months now, which was very cool.
Dinner was great and they showed all of our videos from the contest interspersed throughout the meal. Andy Jassy then got up and announced that Ooyala was the winner and had them smash a 1RU server with a golden hammer (signed by Jeff Bezos). This was to indicate that they didn’t need no stinkin’ servers anymore ;-)
For me, it was a really interesting and fun learning experience. I really appreciated the opportunity to meet with all of the great people at Amazon, the other candidates, VCs and sponsors who attended. Everyone was very impressive. The biggest thing for me, however, was how unbelievably good Lucinda was in all of this. I had never seen her in this capacity before and all I can say is WOW. There wasn’t anything anybody threw at her that she didn’t handle magnificently. I have a new level of appreciation and respect for the opportunity I have to work with her.
Three cheers to Amazon for a great event! I look forward to seeing them run it again in the near future because it was definitely of very high value for all.
Amazon Startup Challenge Videos Posted
The videos for the Amazon Startup Challenge were just posted. Go and check them all out.