Watching The Web
By: The Disenchanted Developer, (c) Melonfire
    2002-10-23

    Table of Contents:
  • Watching The Web
  • Code Poet
  • Digging Deep
  • Backtracking
  • Plan B
  • Closing Time



    Watching The Web



    So there I was, minding my own business, working on a piece of code I had to deliver that evening, when the pretty dark-haired girl who sits in the cubicle behind me popped her head over and asked for my help.

    "Look", she said, "I need your help with something. Can you write me a little piece of code that keeps track of Web site URLs and tells me when they change?"

    "Huh?", was my first reaction...

    "It's like this", she explained, "As part of a content update contract, I'm in charge of tracking changes to about thirty different Web sites for a customer, and sending out a bulletin with those changes. Every day, I spend the morning visiting each site and checking to see if it's changed. It's very tedious, and it really screws up my day. Do you think you can write something to automate it for me?"

    Now, she's a pretty girl...and the problem intrigued me. So I agreed.

    A Little Research

    The problem, of course, appeared when I actually started work on her request. I had a vague idea how this might work: all I had to do, I reasoned, was write a little script that woke up each morning, scanned her list of URLs, downloaded the contents of each, compared those contents with the versions downloaded previously, and sent out an email alert if there was a change.

    Seemed simple - but how hard would it be to implement? I didn't really like the thought of downloading and saving different versions of each page on a daily basis, or of creating a comparison algorithm to test Web pages against each other.
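
    For reference, the download-and-compare plan described above might look roughly like the sketch below. It is only an illustration: the function name pageHasChanged() and the $seen array are hypothetical, it assumes a modern PHP (7 or later) rather than the PHP 4 of the time, and it swaps in a SHA-1 fingerprint of each page in place of storing full copies. Downloading the page and sending the alert email are left out.

```php
<?php
// Hypothetical helper: the name and the storage format are illustrative only.

/**
 * Compare a freshly downloaded page against the fingerprint saved on the
 * previous run, then remember the new fingerprint for the next run.
 *
 * @param string $url     URL the content was downloaded from
 * @param string $content Page body fetched on this run
 * @param array  $seen    Map of URL => fingerprint from earlier runs (updated in place)
 * @return bool True if the page differs from the stored version
 */
function pageHasChanged(string $url, string $content, array &$seen): bool
{
    $fingerprint = sha1($content);

    // A URL seen for the first time has no baseline, so it is not reported as changed.
    $changed = isset($seen[$url]) && $seen[$url] !== $fingerprint;

    $seen[$url] = $fingerprint;
    return $changed;
}
```

    Even in this trimmed-down form, every page still has to be downloaded in full on every run, and the fingerprints have to be saved somewhere between runs, which is the kind of overhead I wanted to avoid.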

    I thought there ought to be an easier way. Maybe the Web server had a way of telling me if a Web page had been modified recently - and all I had to do was read that data and use it in a script. Accordingly, my first step was to hit the W3C Web site, download a copy of the HTTP protocol specification from ftp://ftp.isi.edu/in-notes/rfc2616.txt, and print it out for a little bedside reading. Here's what I found, halfway through:

    "The Last-Modified entity-header field indicates the date and time at which the origin server believes the variant was last modified."

    There we go, I thought - the guys who came up with the protocol obviously anticipated this requirement and built it into the protocol headers. Now to see if it worked...
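
    If the header behaves the way the specification says, the check itself could be as small as the sketch below: turn the Last-Modified value into a timestamp and compare it with the time of the previous check. The function name modifiedSince() is hypothetical, and the code assumes a modern PHP (7.1 or later) for the type declarations.

```php
<?php
/**
 * Decide whether a Last-Modified header value is newer than the last check.
 *
 * @param string $lastModified Value of the Last-Modified header, e.g. an RFC 1123 date
 * @param int    $lastCheck    Unix timestamp of the previous check
 * @return bool|null True if modified after the last check, false if not,
 *                   null if the header value cannot be parsed as a date
 */
function modifiedSince(string $lastModified, int $lastCheck): ?bool
{
    $stamp = strtotime($lastModified);
    if ($stamp === false) {
        return null; // unparseable date: the caller decides how to treat it
    }
    return $stamp > $lastCheck;
}
```

    Returning null for an unparseable date is a design choice of this sketch; a real script might prefer to treat such pages as changed so that nothing slips through.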

    The next day at work, I fired up my trusty telnet client, connected to our intranet Web server, and requested the headers for its home page. Here's the session dump:
    $ telnet darkstar 80
    Trying 192.168.0.10...
    Connected to darkstar.melonfire.com.
    Escape character is '^]'.
    HEAD / HTTP/1.0

    HTTP/1.1 200 OK
    Date: Fri, 18 Oct 2002 08:47:57 GMT
    Server: Apache/1.3.26 (Unix) PHP/4.2.2
    Last-Modified: Wed, 09 Oct 2002 11:27:23 GMT
    Accept-Ranges: bytes
    Content-Length: 1446
    Connection: close
    Content-Type: text/html

    Connection closed by foreign host.

    As you can see, the Web server returned a "Last-Modified" header indicating when the requested file was last changed. So far so good.
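
    The same exchange can be scripted. The sketch below is one possible way to do it, not the finished tool: headRequest() sends the HEAD request over a plain socket, much as the telnet session did, and extractHeader() picks a single header out of the raw response. Both function names are hypothetical, and the code assumes a modern PHP (7.1 or later) and a plain HTTP server on port 80.

```php
<?php
/**
 * Pull one header value out of a raw HTTP response.
 * Header names are matched case-insensitively.
 *
 * @return string|null The header value, or null if the header is absent
 */
function extractHeader(string $rawResponse, string $name): ?string
{
    // Headers end at the first blank line; ignore anything after it.
    $parts = preg_split('/\r?\n\r?\n/', $rawResponse, 2);

    foreach (preg_split('/\r?\n/', $parts[0]) as $line) {
        $pos = strpos($line, ':');
        if ($pos === false) {
            continue; // status line or malformed line
        }
        if (strcasecmp(trim(substr($line, 0, $pos)), $name) === 0) {
            return trim(substr($line, $pos + 1));
        }
    }
    return null;
}

/**
 * Send a HEAD request over a plain socket.
 *
 * @return string|null The raw response, or null if the connection fails
 */
function headRequest(string $host, string $path = '/', int $port = 80, float $timeout = 10.0): ?string
{
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return null;
    }

    // The blank line (\r\n\r\n) tells the server the request is complete.
    fwrite($socket, "HEAD {$path} HTTP/1.0\r\nHost: {$host}\r\n\r\n");
    $response = stream_get_contents($socket);
    fclose($socket);

    return $response === false ? null : $response;
}

// Example use (commented out because it needs a reachable server):
// $raw = headRequest('darkstar');
// $lastModified = $raw === null ? null : extractHeader($raw, 'Last-Modified');
```

    Keeping the parsing separate from the network call means the header extraction can be tried against a saved response, such as the session dump above, without touching a server.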
