Monday, August 11, 2008

SMashups - PART II

SMashups: Scalable, Server-less, Social Mashups for Web 2.0
Getting Out of the Sandbox
--------------------------
First, we need to understand why getting out of the 'sandbox' even matters. Most of the mashups out in the wild today are nothing more than web pages driven by sophisticated back-end web servers that cull information from many different web services into a single page. Some more sophisticated mashups are now evolving with AJAX interfaces, dynamically driven by connecting to the mashup server. Today we see impressive automatic, real-time stock tickers, for example; or we can collaborate live on documents using only a web browser, and even combine web services from many different providers into a single 'mashup' view of the web.

But therein lies the weakness to scalability. With a mashup, when the client needs information from several web services for a single request, the client must first go to a single server that is programmed to converge the various data streams into a single response. This wreaks havoc on server systems and detracts from scalability. I have read many articles about very popular mashups crashing DNS systems, or consuming massive bandwidth and processing power. In technical terms: a bottleneck.

What happens when there are thousands, millions, or even tens of millions of clients? I have read articles about mashup servers grinding to a halt under the load, or bringing down some unsuspecting ISP's DNS system. Sure, the server works fine responding to clients, and that side actually scales quite well. The problem is that for each client request, the server has to query many external servers on the back end, multiplied by the number of users accessing the system at any one time (a mashup pulling from five web services with 10,000 concurrent users means 50,000 simultaneous back-end requests), quickly tapping all system resources.

In an ideal world, mashups would not need to go through the bottleneck server. Rather, the mashup application would reside on the client, and the client would gather all of the web-service resources from the various servers out on the web, then integrate the results right there on the client--but such a thing was previously not possible. Why? I'm sure you probably already know the answer, but for the newbies out there, I'll take a moment to explain.

While AJAX is all the rage nowadays, the technology itself has been around for well over 10 years, silently taking root throughout the web. In fact, 'Ajax' as a term wasn't even used before 2005 (see Jesse James Garrett's original article here: http://adaptivepath.com/ideas/essays/archives/000385.php). However, I was using these same technologies in 1999, and the people who introduced me to them had been using them for some time before that. My point is that Ajax isn't really all that new; what the concept needed to go mainstream was the name. Also, just so everyone knows, AJAX is NOT an acronym. It's a big debate, and I just wanted to weigh in with my opinion. While I do not want to rehash all the information in his article, I do want to call out something Jesse James wrote at the end: "The biggest challenges in creating Ajax applications are not technical. The core Ajax technologies are mature, stable, and well understood. Instead, the challenges are for the designers of these applications: to forget what we think we know about the limitations of the Web, and begin to imagine a wider, richer range of possibilities."

Back to the point of this article. Although Ajax is not really a single technology, but rather a blend of technologies, the most important piece is the XMLHttpRequest object, which allows a page to exchange data with the server without refreshing the entire page. This lets developers send smaller snippets of data and update portions of the screen dynamically, drastically improving the user experience. But XMLHttpRequest came with a string attached: one could only issue an XMLHttpRequest back to the same domain the original page was loaded from. This 'sandbox' protects the user, but it also FORCES the client to use just a single server, creating the mashup bottleneck dilemma. Furthermore, because truly effective mashups use web services from several servers, the more sophisticated the mashup, the greater the demands on that one server. So, wouldn't it be great if the client could collect the data from the various sources directly, without having to go through a bottleneck mashup server first? "Ah! But wait!", I hear you cry, "the whole point of the 'bottleneck' server is to get around the browser 'sandbox', because clients cannot make XMLHttpRequest calls to foreign domains." But that's just not so, anymore.
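
To make the restriction concrete, here's a minimal sketch of the sandbox in action (the URLs are made up, and the exact failure mode varies by browser):

// the same-origin sandbox in action (URLs are made up)
var xhr = new XMLHttpRequest(); // older IE: new ActiveXObject("Microsoft.XMLHTTP")

// same domain as the page that loaded this script: allowed
xhr.open("GET", "/api/local-service", true);
xhr.send(null);

// a foreign domain: the browser refuses, typically throwing a
// security/permission-denied error (exactly where it throws varies by browser)
var foreign = new XMLHttpRequest();
foreign.open("GET", "http://api.example.com/service", true);
foreign.send(null);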

I'll explain why the sandbox doesn't matter shortly. But first, consider the implications. Many mashup developers would immediately be able to get rid of their servers and run the entire mashup within the client. Of course, enterprising organizations may still store information on servers for logging and the like, so a server would remain necessary for those purposes. But almost everything happening on the client could take place without impacting the server (bandwidth, processing, DNS, even temporary storage and retrieval). A majority of the mashups I have researched could accomplish the same application entirely without a server!

To make this possible, we somehow have to get around that blasted sandbox so that our client can call any domain on the internet directly, without going through a bottleneck server. We can accomplish this thanks to what I call the social container, the OpenSocial container being one example. Of course, we can use almost any social networking platform, which leaves Facebook, Windows Live (think Popfly), and many other social containers that provide context. For the sake of this article, though, I am going to focus on the OpenSocial container as the means of creating SMashups, and I'll also provide URLs to resources for other social containers. But back to the problem of the sandbox. Fortunately for us, the OpenSocial container provides just the answer, in the form of:


gadgets.io.makeRequest(url, callback, params);

Granted, this is actually a call to gadgets.io, not opensocial, so why all the hubbub about social context and containers? We'll get into that later, but keep in mind the three S's of SMashups (scalable, serverless, and social). Also keep in mind that OpenSocial containers directly support Google gadgets, and that other social containers support gadgets and/or widgets of their own. As I keep repeating, though, we'll be focusing on gadgets and OpenSocial in this article.
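
One practical note before we go further: all of the JavaScript in this series lives inside a gadget spec, which is an XML file, not a bare web page. Here's a rough sketch of the minimal wrapper (the title is a placeholder, and the feature version is whatever your container supports; check your container's documentation):

<?xml version="1.0" encoding="UTF-8"?>
<Module>
  <ModulePrefs title="SMashup Example">
    <!-- gadgets.io is part of the core gadget API; this pulls in opensocial.* too -->
    <Require feature="opensocial-0.7"/>
  </ModulePrefs>
  <Content type="html"><![CDATA[
    <div id="content"></div>
    <script type="text/javascript">
      // the JavaScript from this article goes here
    </script>
  ]]></Content>
</Module>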

This changes everything, as you'll see. Rather than programming the client to call a mashup server, forcing that server to hit the other web services and return an integrated result, we can now program the client to gather (and even cache) resources on the client side. From the mashup developer's perspective, we completely eliminate the need for the mashup server. Even for enterprises, this represents a huge opportunity, allowing companies to focus their servers on providing core-business web services while offloading mashup composition to the client, freeing resources. Now, granted, one could argue that we haven't technically changed a thing, and that we're just using the social container as a proxy, effectively offloading the 'mashupping' onto its servers...

...and that's just the point. Never before was it possible to reach such a large audience with Ajax applications that call out to many, many servers, all without having to program a 'dedicated' server. This technology lets mashup developers focus on developing mashup applications, skipping all the costs and hassle associated with even the most trivial IT operations...

So, we see that we have a powerful new way to communicate from the client. But gadgets.io.makeRequest() is not without its flaws; in this case, the weakness is its very general nature. makeRequest() is great for fetching data, but it doesn't give us, as developers, anything to hold onto, and no context when it returns. XMLHttpRequest, by contrast, gave us a nice little object to keep around, with callback hooks (onreadystatechange natively, or onError, onSuccess, and onTimeout in the popular wrapper libraries), and when we received a callback, the object was already loaded with response data. We could hold the object in memory and manipulate its values...
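
As a refresher, here's roughly the classic pattern (a minimal sketch; the URL is made up): we hold the request object, and by the time readyState hits 4, the object already carries the response.

// a minimal sketch of the classic XMLHttpRequest pattern (the URL is made up)
var xhr = new XMLHttpRequest(); // older IE: new ActiveXObject("Microsoft.XMLHTTP")
xhr.onreadystatechange = function() {
  if (xhr.readyState == 4) { // request complete
    if (xhr.status == 200) {
      var doc = xhr.responseXML; // the response is already parsed and attached
      // ...work with doc here, or stash xhr somewhere and read it later...
    } else {
      alert("request failed with status " + xhr.status);
    }
  }
};
xhr.open("GET", "/api/someservice?someparameter=1", true);
xhr.send(null);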

That's not quite the case with makeRequest(). With makeRequest(), we tell it which URL to fetch from (url), what function to call on return (callback), and which params to use. If we are issuing a POST, for example, then we set the appropriate param to indicate a POST, and another param to carry the data (params is a map of parameters). Here's how to use makeRequest():

function example() {
  var url = "http://www.Rhapsody.com/API/someservice?someparameter=1";
  var _params = {};
  _params[gadgets.io.RequestParameters.CONTENT_TYPE] = gadgets.io.ContentType.DOM;
  gadgets.io.makeRequest(url, handleresults, _params);
}

function handleresults(response) {
  if (response.errors.length > 0) {
    alert(response.errors[0]);
  } else {
    alert(response.data.xml);
  }
}


One more thing worth pointing out: during development, it's often useful to be able to grab fresh information, even if you just accessed it recently. This can be a problem in OpenSocial because the container usually caches requests for some period of time. Even if you update the resource you are trying to load, a fetch may return only what is in the cache, not the data you updated. To overcome the cache, we use a very old technique: append what is essentially a timestamp onto the end of the request URL, effectively convincing the cache that this is a new request to a new URL :) Here's the code (which is available on the OpenSocial API tutorial page):


function makeCachedRequest(url, callback, params, refreshInterval) {
  var ts = new Date().getTime();
  var sep = "?";
  if (refreshInterval && refreshInterval > 0) {
    ts = Math.floor(ts / (refreshInterval * 1000));
  }
  if (url.indexOf("?") > -1) {
    sep = "&";
  }
  url = [url, sep, "nocache=", ts].join("");
  gadgets.io.makeRequest(url, callback, params);
}

Using this makeRequest wrapper, we can easily overcome the cache. Note that refreshInterval is in seconds: the current time is bucketed into refreshInterval-sized windows, so a value of 1 gives us data no more than a second old, while larger values let the container keep serving its cached copy longer. Of course, we would want to override the cache mainly for development purposes, because most of the information in a mashup query would not change from minute to minute. For example, if we search the Amazon API for book lists and the user searches the same terms over and over, the cache automatically responds, saving the round trip :)
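
For instance (a sketch; the URL and handler are placeholders):

var _params = {};
_params[gadgets.io.RequestParameters.CONTENT_TYPE] = gadgets.io.ContentType.DOM;

// during development: bucket requests into 1-second windows, so we always
// see near-fresh data
makeCachedRequest("http://api.example.com/books?q=ajax", handleresults, _params, 1);

// closer to production: a 300-second (5-minute) window lets the container
// serve repeated queries straight from its cache
makeCachedRequest("http://api.example.com/books?q=ajax", handleresults, _params, 300);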

Also, we'll create two more functions, just to help us get started...

function xmlhttpGet(url, responsehandler, refreshinterval) {
  var _params = {};
  _params[gadgets.io.RequestParameters.METHOD] = gadgets.io.MethodType.GET;
  _params[gadgets.io.RequestParameters.CONTENT_TYPE] = gadgets.io.ContentType.DOM;
  makeCachedRequest(url, responsehandler, _params, refreshinterval);
}

function xmlhttpPost(url, postdata, responsehandler, refreshinterval) {
  var _params = {};
  _params[gadgets.io.RequestParameters.METHOD] = gadgets.io.MethodType.POST;
  _params[gadgets.io.RequestParameters.POST_DATA] = postdata;
  _params[gadgets.io.RequestParameters.CONTENT_TYPE] = gadgets.io.ContentType.DOM;
  makeCachedRequest(url, responsehandler, _params, refreshinterval);
}

Most likely, we'll only be using GET. Later in the series, I'll show you how to use the social container to store persistent data, freeing the server from having to store most trivial profile/settings/favorites data :)

However, there's one other design consideration to take into account: namespaces. Keep in mind that your code could be running in any number of contexts, most of which you will not control. On this type of platform, the chance of naming conflicts increases dramatically. For this reason, try not to use global variables, or even global functions. Instead, we'll use encapsulation to hang all of our functions off a single namespace. From this point forward, I'll be using the nolyXMLHTTP namespace (you can use whatever you'd like, of course). The code in this section would therefore be packaged as follows:

function nolyXMLHTTP() {}

// these are all 'static' functions, meaning they do not access any instance fields/data
nolyXMLHTTP.makeCachedRequest = function(url, callback, params, refreshInterval) {
  var ts = new Date().getTime();
  var sep = "?";
  if (refreshInterval && refreshInterval > 0) {
    ts = Math.floor(ts / (refreshInterval * 1000));
  }
  if (url.indexOf("?") > -1) {
    sep = "&";
  }
  url = [url, sep, "nocache=", ts].join("");
  gadgets.io.makeRequest(url, callback, params);
};

nolyXMLHTTP.getRequest = function(url, responsehandler, refreshinterval) {
  var _params = {};
  _params[gadgets.io.RequestParameters.METHOD] = gadgets.io.MethodType.GET;
  _params[gadgets.io.RequestParameters.CONTENT_TYPE] = gadgets.io.ContentType.DOM;
  nolyXMLHTTP.makeCachedRequest(url, responsehandler, _params, refreshinterval);
};

nolyXMLHTTP.postRequest = function(url, postdata, responsehandler, refreshinterval) {
  var _params = {};
  _params[gadgets.io.RequestParameters.METHOD] = gadgets.io.MethodType.POST;
  _params[gadgets.io.RequestParameters.POST_DATA] = postdata;
  _params[gadgets.io.RequestParameters.CONTENT_TYPE] = gadgets.io.ContentType.DOM;
  nolyXMLHTTP.makeCachedRequest(url, responsehandler, _params, refreshinterval);
};

Doing this allows us to write code such as the following:

function main() {
  nolyXMLHTTP.getRequest(someURL, callbackhandler, 1);
  // or
  nolyXMLHTTP.postRequest(someURL, someData, callbackhandler, 1);
}

function callbackhandler(response) {
  if (response.errors.length > 0) {
    alert(response.errors[0]);
  } else {
    alert(response.data.xml);
  }
}

It may not seem like we have made much progress. After all, the only difference between the first code above (calling makeRequest() directly) and this last code (using our namespace) is that the calls now go through a shallow wrapper that gives us separate get and post methods. However, there is a method to this madness, which we'll get to in Part III. You'll find Part III really rewarding, as we'll start building our own XMLHttpRequest-like object that gives us most of the core functionality we expect from the standard XMLHttpRequest object we all know and love. It takes a bit of voodoo to make that happen, which I hope everyone will appreciate :)

Until Part III...

...happy coding,
nolybab praetorius
END OF PART II
