Today's websites and web pages are complex structures involving several components, and it can be daunting for a beginner to understand them all at once. To ease the process, let's break things down and see how the various pieces fit together.
At the core of web pages is HTML. It is the language that browsers understand, and it is what web developers use to present rich content. Without HTML, all our web pages would be simple text files. I will assume that you know what HTML is and how it is used in web pages. Another critical piece that works closely with HTML is CSS, but we can skip that one for now.
Plain HTML web pages are pretty dull. They are not really interactive. Apart from text, they mainly contain:
- Some formatting information which tells the browser how to show the contents.
- Some media content like images or videos.
- A bunch of hyperlinks which lead to other similar web pages.
- In slightly more evolved pages, forms which allow a user to send data back to the server.
It is the third one which helped form this "web" of pages that we call the World Wide Web. How do these links work? Pretty simple, actually. Each link typically points to a web page on a server. Whenever a user clicks a link, the browser fetches that page, removes the existing page, and shows the newly downloaded one. Sometimes the link points back to the same page, in which case clicking on it just "refreshes" the existing page. We all know about the refresh/reload button in browsers, don't we?
There are a few things to note here.
- These basic web pages were dull in the sense that they were not interactive. They did not respond to user actions, except for clicks on links, in which case the existing web page went away and a fresh page was loaded.
- When a link was clicked, the browser sent a request to the server. The web page itself had little control over when the request should be made.
- When the browser was making a request in response to a link click, there was pretty much nothing you could do with the web page until the request completed. Every request to the server blocked the entire page, and if multiple requests had to be made, they had to be made sequentially, one after the other. A fancy technical word to describe this would be "synchronous".
- An obvious observation is that the data that came from the server in response to a browser's request was HTML. That is, both the data and the information on how to display that data were sent back from the server every time.
While the fourth point does not look like such a big problem, the first three are definite limitations.
To work around the first problem, browser developers came up with JavaScript. The browser folks told web developers that they could make their web pages interactive and responsive to user actions by writing code to handle those interactions in this new language called JavaScript and including it in the web pages. Web developers could now write different pieces of code to handle different user actions. The browser also provided a set of functions (an API) in JavaScript which web developers could use to make changes to the web page. This API is called the DOM (Document Object Model). Thus web pages became interactive and responsive.
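As a tiny illustration of the idea, here is a minimal sketch of a click handler; the element ids are made up for this example:

```javascript
// Hypothetical page elements: <button id="greet-btn"> and <p id="greeting">.
var button = document.getElementById('greet-btn');
button.addEventListener('click', function () {
  // React to the user's click and update the page through the DOM.
  // No request goes to the server; everything happens in the browser.
  document.getElementById('greeting').textContent = 'Hello there!';
});
```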
The second and third limitations still existed, however. The only way to request data from the server was through links or, in evolved pages, through forms. Also, the web page was one single unit: whenever something had to change, the whole page had to be reloaded. Imagine that with your Gmail page. How would it be if the page refreshed every time you received a chat message, or worse, every time one of your contacts logged in or out of chat? That would be disastrous. A sports website would fare similarly: the small corner showing the score needs to be updated far more frequently than the rest of the page.
To solve this, some people thought of iframes as a solution. An iframe is an HTML tag inside which you can put another HTML page. With this arrangement, the inner page can be reloaded without affecting the outer page. So the sports website could now put the score part in an iframe and refresh it without affecting the rest of the page, as in the sketch below. Similarly, Gmail could break up its page into multiple iframes (it actually does this in reality, but it also does a lot more than this).
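A rough sketch of that score iframe; the id, URL, and refresh interval are made up, and this only works when the framed page comes from the same origin:

```javascript
// Assume the page embeds the live score in an iframe:
//   <iframe id="score-frame" src="/live-score.html"></iframe>
// Reloading just that frame refreshes the score while the rest
// of the page stays untouched.
setInterval(function () {
  var frame = document.getElementById('score-frame');
  frame.contentWindow.location.reload();
}, 30000); // refresh the score every 30 seconds
```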
With iframes, the "synchronous" problem was somewhat solved: multiple requests could now be made in parallel from a single web page. However, the method of making the request largely remained the same, using links or forms. It was either not seamless or almost impossible to make requests in any other manner, like, say, whenever the user typed something (as in Google search). Now let's take stock of the capabilities we have on top of basic web pages:
- Web pages can respond to user actions: code can be executed when the user acts on different parts of the page.
- This code can modify the web page using the DOM.
- Browsers are now capable of making multiple requests in parallel without blocking the web page.
The browser folks now combined these abilities, allowing web developers to make requests to servers in response to user actions, from JavaScript. These requests could be made in parallel without blocking the page, i.e. asynchronously. When the response came back, the same JavaScript code could add the HTML received in the response to the existing page using the DOM.
The developers went a step further and said: "Why receive HTML every time? In most cases the formatting or presentation information is already there in the web page. What is needed is only new data or content. (In the sports website example, all that is needed is the updated score; information on how to show that score is already present in the page.) So instead of fetching HTML every time, let's just fetch data and make things much faster and more efficient." Back then, the most popular way of representing data in a structured manner was XML, so developers started receiving data in XML format from the server and using it to update the web page. XML was so popular that browsers added support for automatically parsing XML data received in response to these requests.
This approach of making requests and updating parts of web pages became very popular and widely used. Since it was predominantly used to make requests over HTTP (the protocol of the web) to fetch XML data, the technology was called "XMLHttpRequest" (Internet Explorer originally exposed it under a different name, as an ActiveX object; its modern versions support the standard one).
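A bare-bones sketch of how it is used; the URL and element id are made up:

```javascript
var xhr = new XMLHttpRequest();
// The third argument is true for asynchronous: the page stays
// responsive while the request is in flight.
xhr.open('GET', '/news-fragment.html', true);
xhr.onload = function () {
  if (xhr.status === 200) {
    // Splice the HTML we received into the existing page via the DOM.
    document.getElementById('news-panel').innerHTML = xhr.responseText;
  }
};
xhr.send();
```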
So what does this XHR offer?
- Making requests "Asynchronously"
- Making requests from "JavaScript"
- Making requests for "XML"
Put together, it forms "AJAX - Asynchronous JavaScript and XML".
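Fetching XML instead of HTML looks almost the same. Here is a sketch for the score example; the URL, tag, and element names are made up, and the server is assumed to send an XML Content-Type so the browser parses the response automatically:

```javascript
var xhr = new XMLHttpRequest();
xhr.open('GET', '/live-score.xml', true);
xhr.onload = function () {
  if (xhr.status === 200) {
    // The browser has already parsed the XML response for us,
    // e.g. <match><score>2-1</score></match>
    var score = xhr.responseXML.getElementsByTagName('score')[0].textContent;
    // Only the data travelled over the wire; how to show it is
    // already in the page, so we just drop the new value in.
    document.getElementById('score').textContent = score;
  }
};
xhr.send();
```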
Notes:
- Several things have been simplified here as this is aimed at people new to web development and Ajax in particular. This should nevertheless give a good starting point.
- References to XML should be read as "something other than HTML". Nowadays XML has largely been replaced by JSON, which is less verbose and parses directly into JavaScript objects.
- XMLHttpRequest does not necessarily mean making asynchronous requests; it supports synchronous requests too. But there is rarely a good reason to make one: a synchronous XHR blocks the entire page until the request completes. Really, don't do that.
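To make the last two notes concrete, here is the same kind of call in its modern JSON flavour, with the async flag spelled out; the URL and field name are made up:

```javascript
var xhr = new XMLHttpRequest();
// The third argument is the async flag. Passing false would make the
// request synchronous and freeze the whole page until it completes.
xhr.open('GET', '/api/score', true);
xhr.onload = function () {
  if (xhr.status === 200) {
    // JSON parses straight into a JavaScript object.
    var data = JSON.parse(xhr.responseText);
    document.getElementById('score').textContent = data.score;
  }
};
xhr.send();
```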