[wget-notify] [bug #20329] Make HTTP timestamping use If-Modified-Since

Micah Cowan INVALID.NOREPLY at gnu.org
Fri Aug 29 10:34:25 PDT 2008


Follow-up Comment #3, bug #20329 (project wget):

From IRC chat with Daniel Stenberg of curl:


<Bagder> micahcowan: btw, regarding time-conditional requests (I noticed a
bug entry of yours), in curl we do time checks ourselves as well, in case the
data comes anyway
<Bagder> for cases where the server doesn't support it, or ignores it
<micahcowan> Bagder, talking about the If-Modified-Since thing?
<Bagder> yes
<micahcowan> Bagder, have you encountered servers that give HTTP/1.1 in the
response, and yet will still serve "old" content (complete with a
Last-Modified date that precedes or is identical to the one in
If-Modified-Since)?
<Bagder> I'm quite sure, but I worked on that ages ago so I don't recall the
details
<micahcowan> I suppose it was usually specialized CGI scripts? Or were there
actual servers serving static content that suffered from this?
<Bagder> I don't remember, but given that my additional check worked they
still at least provided Last-Modified: headers
<micahcowan> Bagder, then, in your view, would we be best served by requiring
an explicit --use-if-modified-since (or something with a better name ;) ) to
support that header, regardless of whether we do HTTP/1.1 properly, and cache
information about the server's HTTP version?
<micahcowan> Alternatively, I suppose Wget could assume that HTTP/1.1 servers
DTRT, but also cache information about buggy servers, and stop trying
If-Modified-Since after the first time it has to terminate a connection
because it got a 200 response when it shouldn't have.
<micahcowan> (in further requests to the same server, that is)
<Bagder> not really, I don't think the header will hurt in the 1.0 case, it
just won't be recognized by the server.
<Bagder> and if you get a 200 with a Last-Modified, you can still make your
own secondary check
<Bagder> at least on http1.0 responses
<Bagder> ... assuming they still exist ;-)
<micahcowan> Bagder, so, use If-Modified-Since by default, even now before
we've got HTTP/1.1 support, instead of using HEAD, and just cut the connection
if we get old content anyway?
<Bagder> yes, I think I'd prefer that
<micahcowan> And that's what curl currently does? I think I had
misinterpreted your first few sentences to mean that curl does the HEAD thing
that Wget currently does.
<Bagder> curl never does HEAD
<Bagder> HEAD is unreliable
<Bagder> well, it does HEAD if you ask it explicitly but not otherwise
<micahcowan> Right, I figured that was what you meant. :)
<micahcowan> I think the primary reason Wget has historically used HEAD has
been for timestamping. In which case it probably is reliable enough, if it
bothers to include Last-Modified. But If-Modified-Since would certainly be
preferable; at least if it works in most cases (and especially if it still
works even when we claim to be speaking HTTP/1.0).
<micahcowan> The other, newer, use, is for determining what the local file's
name will be, when we're doing timestamping (or --no-clobber) along with
Content-Disposition stuff. I'm already planning to prefer doing GET and
shutting the connection in the near future: I don't like the extra HEAD
requests, and HEAD is even _less_ reliable than usual when it comes to
Content-Disposition, AFAICT.
<Bagder> right, HEAD makes some sense when talking 1.0 since the
time-conditional requests ain't there then
<micahcowan> Thanks for your experienced input; I appreciate it. :)
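
The approach discussed above (send If-Modified-Since unconditionally, then do
a secondary client-side check when the server answers 200 anyway) could be
sketched roughly as follows. This is only an illustration, not code from Wget
or curl: the function names if_modified_since_header and should_download are
hypothetical, and it assumes the local file's timestamp is available as a
timezone-aware UTC datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def if_modified_since_header(local_mtime: datetime) -> dict:
    """Build the conditional header from the local file's mtime (must be UTC)."""
    return {"If-Modified-Since": format_datetime(local_mtime, usegmt=True)}


def should_download(status: int, last_modified: Optional[str],
                    local_mtime: datetime) -> bool:
    """Decide whether to keep reading the response body.

    Returning False on a 200 means the server ignored If-Modified-Since
    and the caller should cut the connection instead of re-downloading.
    """
    if status == 304:
        # The server honoured the conditional request: nothing new.
        return False
    if status != 200:
        # Other statuses are outside the scope of this sketch.
        return False
    if last_modified is None:
        # No Last-Modified header, so no secondary check is possible.
        return True
    try:
        remote = parsedate_to_datetime(last_modified)
    except (TypeError, ValueError):
        # Unparseable date: fall back to downloading.
        return True
    if remote.tzinfo is None:
        # A "-0000" zone yields a naive datetime; treat it as UTC.
        remote = remote.replace(tzinfo=timezone.utc)
    # Secondary check: only download if the remote copy is strictly newer.
    return remote > local_mtime
```

A caller would send the header returned by if_modified_since_header with a
plain GET, then pass the response status and Last-Modified value to
should_download before reading the body.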


    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?20329>
