
    Monday, August 10, 2009

    An Analyst's Development Environment


    Here in the land of academic research, we're working with a "new" take on mashups. It seems like a no-brainer to me, but a lot of people have expressed interest and surprise when I explain what we're doing. For now, let's call it an analyst's development environment (ADE).

    One thing mashups are really, really good at is taking disparate data sources and letting you create "momentary" relationships between them. In effect, this creates a new data source that is a fusion of the inputs. As is often the case with fusions, the new source tends to be more than the sum of its parts. You often come up with new views of the data as you add extra sources.

    Most people stop here, at the fusion stage. Once they have the new view of the data, they rely on tools outside the scope of the mashup to do interesting things with it. They might pipe the data into a tool such as Fusion Charts to visualize it, or into an analysis tool such as a model or simulation. But why do they need to leave the scope of the mashup to do this? What if that analysis, or the creation of the Fusion Charts XML, were an automated part of the mashup itself?

    Mashups deal primarily with web services (though there are some nifty products out there that let you mash more than just web services). A web service is usually considered to be a data source, but in practice web services are much more than that. Consider all of the specialized web services provided by Google for geolocation, or by Amazon for looking up details about books. The simplest example I can give is Google's web service that converts an address to a latitude and longitude pair (called geocoding). With these in mind, let's take a different look at web services. Let's look at them as processing units.

    A processing unit meets three criteria: it takes input, does something interesting with that input, and provides output. Processing units are the basis of modern programming. They're known as methods, functions, procedures, etc., depending on context. We can usually build bigger processing units from simpler ones.
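
    To make the "bigger units from simpler units" idea concrete, here is a minimal Python sketch. The names compose, parse_pair, and round_pair are made up for illustration and are not part of any tool mentioned in this post.

```python
from functools import reduce
from typing import Any, Callable


def compose(*units: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Build one bigger processing unit that runs the given units left to right."""
    def pipeline(value: Any) -> Any:
        # Feed the output of each unit into the next one.
        return reduce(lambda acc, unit: unit(acc), units, value)
    return pipeline


def parse_pair(text: str) -> tuple:
    """Unit 1 (hypothetical): turn 'lat, lon' text into a pair of floats."""
    lat, lon = text.split(",")
    return float(lat), float(lon)


def round_pair(pair: tuple) -> tuple:
    """Unit 2 (hypothetical): round both coordinates to two decimal places."""
    return tuple(round(x, 2) for x in pair)


# The composed unit still takes input, does something with it, and provides output.
parse_and_round = compose(parse_pair, round_pair)
```

    Calling parse_and_round("38.8977, -77.0365") runs both units in order, and the result is itself a processing unit that can be composed again.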

    Web services fit these three criteria handily. You can easily provide input, they can do something interesting with that input, and they can just as easily provide output. All communication happens in a standardized, protocol-driven environment.

    The interesting thing about web services is that, with the right tools, we can string them together into processes rather easily. That's exactly what we're doing here. Each web service is either a data source or a processing unit. Given an easy way to ferry data from one web service to the next, it is possible to create mashups that do more than just mash data. They actually do some form of processing.
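
    As a rough sketch of that idea, the Python below ferries a payload through a data source and then a processing unit. Both "services" are in-process stand-ins I made up for illustration (no network calls, and JSON is just a convenient payload format here); a real mashup tool would call actual web service endpoints instead.

```python
import json
from typing import Callable, List


def visits_source(_request: str) -> str:
    """Stand-in data source (hypothetical): returns records as a JSON payload."""
    records = [{"name": "Lab A", "visits": 12}, {"name": "Lab B", "visits": 30}]
    return json.dumps(records)


def total_visits_service(payload: str) -> str:
    """Stand-in processing unit (hypothetical): sums the 'visits' field."""
    records = json.loads(payload)
    return json.dumps({"total_visits": sum(r["visits"] for r in records)})


def run_mashup(steps: List[Callable[[str], str]], request: str = "") -> str:
    """Ferry the payload from one service to the next, in order."""
    payload = request
    for step in steps:
        payload = step(payload)
    return payload


result = run_mashup([visits_source, total_visits_service])
```

    Because every step takes a payload and returns a payload, swapping in another processing unit (a model, a transform) only means changing the list passed to run_mashup.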

    Consider what it would be like if you had a web service endpoint attached to a model. You could pre-mash your data from various sources, then run it all through the model to create a new output that would be very interesting. It would be so easy.

    Using Presto, we recently put together a demo that worked along these lines. This approach let the demo come together in several weeks rather than several months. We used Presto to access databases, then ferried that data (in XML format) into a custom-built web service that ran XSL transforms on it. That produced Fusion Charts XML, which we then piped into our presentation layer for visualization. It was easy.
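
    The transform step can be sketched roughly as follows. This is not our actual service: it uses Python's xml.etree.ElementTree as a stand-in for the XSL transform, and both the input row layout and the simplified chart/set output shape are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed shape of the XML ferried out of the database step (hypothetical).
ROWS_XML = """<rows>
  <row><month>Jan</month><count>10</count></row>
  <row><month>Feb</month><count>25</count></row>
</rows>"""


def rows_to_chart_xml(rows_xml: str, caption: str) -> str:
    """Turn row-oriented XML into a simplified chart XML document.

    Stand-in for the XSL transform: one <set> element per input <row>.
    """
    rows = ET.fromstring(rows_xml)
    chart = ET.Element("chart", {"caption": caption})
    for row in rows.findall("row"):
        ET.SubElement(
            chart,
            "set",
            {"label": row.findtext("month"), "value": row.findtext("count")},
        )
    return ET.tostring(chart, encoding="unicode")


chart_xml = rows_to_chart_xml(ROWS_XML, "Monthly counts")
```

    The resulting string is what would be handed to the presentation layer; wrapping rows_to_chart_xml behind a web service endpoint is what turns it into a processing unit in the mashup.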

    Here is a diagram of what the actual flow of the mashup was.

    Here is a screen shot of the actual chart produced by the generated Fusion Charts XML.


    An ADE would work in a similar way. Given tools that ferry data from one endpoint to another and a grab-bag of analysis and transformation web services, an analyst could create some amazing things with little effort or technical know-how. The only developer support needed would be for creating any custom web services. It could be a very powerful tool.

    1 comment:

    Rob Shell said...

    I love the whole model! :-)