The latest Information Week carries a provocative interview with Microsoft’s Chairman and Chief Software Architect, William H. Gates, in which he discusses work being undertaken at MS’s research campuses, including one in Bangalore.
What struck me was not so much the ‘we’re overtaking Google’ rhetoric as how they were thinking about large domains of ill-connected information. The focus was on scientific information, and it was interesting that the example he discussed was one of pulling together astronomical data.
A couple of quotes:
We’re seeing fields of science that have so much data that without our ability to data mine and [manage] work flow and visualize, they can’t make progress. The Sky Server example is sort of typical. In astronomy, historically, you wanted to be lucky enough to be gazing at the stars on a night when something interesting happened, and then you wrote a paper about quasars or something. Today, there are thousands of observation points around the world at different locations, at different wavelengths, different resolutions. There are a couple of satellites–lots of things up in the sky. And if you, as an astronomer, want to say, “Well, galaxies cluster like this, or these light sources work like this”–in order to test that hypothesis, there are thousands of databases in different formats that you have to pull data out of and look at and see if they’re consistent with your hypothesis. What [Microsoft researcher] Jim [Gray] did is he got the astronomers together to see how you could use Web services to create essentially what we call Sky Server, one logical database. It doesn’t mean all the data has to be copied into one place, but you can query it, and it goes out and pulls in the right information. That was a smashing success, but it was based on Jim’s view that there’s so much data in the sciences that without the kind of software management that we have, both in our products and in our research, that they won’t be able to make the rapid advances that they should.
Nowhere is that more true than in biology, life sciences, where you’re just gathering so much data. The ability to connect these data sources together using our very state-of-the-art Web service and visualization approaches is pretty exciting. So it’s not like we woke up one day and said, “Oh, let’s work on some non-software problems.” It’s like if you noticed that engineers were using math, and you said to the mathematicians, “When did you decide to help these poor engineers?” The mathematician would say, “No way we did.” The engineers figured out that the only way to describe the ideas of material strength, and crystal fracturing, and all of these very complicated things, was [through] very deep mathematical techniques. Well, now, mathematics alone isn’t enough. You need software that deals with vast amounts of data.
Part of the problem in legal information is integration. We still poke around in dozens of places. What would connecting law using advanced visualization look like? Are we too professionally and jurisdictionally challenged to contemplate these issues?