Everyone knows Http is the underlying concept of Web. The base concept is a request is initiated by a browser or web-client and you get a response back. Some day’s back I noticed some interesting slides from Alessandro Nadalin’s talk Mixing the history of HTTP, SPDY and HTTP/2.0
It really inspired me to look more into the Http. I also understood a fact that I was missing many basic stuffs. I thought of playing with it.
Note : This is just experimental stuffs which I tried to learn looking at various sources. Things can be wrong also, feel free to correct if you think so.
Normally you don’t care about the underlying http headers. There are status codes which are passed before an actual content is send. So the browser knows the state whether the content is available in server, what content-type is coming, for eg like html, json, xml etc. The content-length, accept-language, etc. You can see a list of http request and response headers at http://en.wikipedia.org/wiki/List_of_HTTP_header_fields.
So we are going to create the headers other than the normal apache header. By default the status is 200 OK.
If you notice the broswer headers you can see something like
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
and the respose source as
So you may not have seen any benefict in the above for its what you normally get. Now lets enter to the world of http caching.
You may have heard about Expires, Cache-Control, Last-Modified, Etag headers. Yes with the help of these we can make use of http caching. Let’s see a base example.
What will happen when you request the same url? It reloads the page. So the base concept is same, the client requests to server and server responds. But the scenario changes, if the browser is asking with some additional headers like If-Modified-Since or If-None-Match.
If the response from server has set Last-Modified header value, then the request initiated by broswer will be having If-Modified-Since header.
1 2 3
Now if you look into the Request Headers, you can notice the If-Modified-Since is send.
1 2 3 4 5 6 7 8 9
But we weren’t validating the data to make use of caching. What we need to do is check whether the server variable HTTP_IF_MODIFIED_SINCE is set. If its set, we need to validate whether its being modified. If both are same we just need to send a 304 status code which means not modified with out any content. I am hard coding for demonstration purpose.
1 2 3 4 5 6 7 8
So in this case if you reload the page, it will send a 304 not modified header. Getting the 304 header browser knows its not modified. This is too fast and we save a huge band width and server processing. Isn’t it too good? I recall his slides 21 years of http which still not used very nicely :–/
Let’s move on to E-tag. As you have noticed the problem with Last-Modified and Expires is it uses GMT time. If the server and the client are not sync, then its hard. Thus the E-tag is introduced. You can create a md5 hash and set the E-tag header.
If you are setting a E-tag, then you can use the server variable HTTP_IF_NONE_MATCH to validate it and send a status of 304.
1 2 3 4 5 6 7 8 9 10 11 12
Note : In the above examples I have hard coded the time and md5 hash. But in the real world you should use file modified time or something like that depending on the use case.
Status Code : http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
Caching in Http : http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html