decreased caching efficiency?
Hey gang,
Has anyone else around here noticed a decrease in caching efficiency over, say, the past year or so? Seems we've seen a radical drop (an order of magnitude). Seems popular sites are using more and more entirely dynamic, rapidly changing content.
If there's indeed a reduction in efficiency, caches simply introduce more transactional latency and provide no benefit to offset cost. What do people consider to be reasons to keep caching in the network? Have caching infrastructures materialized as starting points for content distribution, or have you guys ultimately rebuilt your infrastructure to serve that specific purpose?
Faced with high hw cost, licensing fees, and reduced efficiencies, the business case for keeping the caches in the network (with all the effort and ugliness it takes to maintain exclude lists) seems difficult to make.
Cheers, Chris
PS: if you feel there's a place better suited to discussing this, shoot me a pointer. thx.
-- Christian Kuhtz Architecture, BellSouth.net <ck@arch.bellsouth.net> -wk, <ck@gnu.org> -hm Atlanta, GA "Speaking for myself only."
On Thu, Oct 19, 2000, Christian Kuhtz wrote:
Hey gang,
Has anyone else around here noticed a decrease in caching efficiency over, say, the past year or so? Seems we've seen a radical drop (an order of magnitude). Seems popular sites are using more and more entirely dynamic, rapidly changing content.
If there's indeed a reduction in efficiency, caches simply introduce more transactional latency and provide no benefit to offset cost. What do people consider to be reasons to keep caching in the network? Have caching infrastructures materialized as starting points for content distribution, or have you guys ultimately rebuilt your infrastructure to serve that specific purpose?
I don't know anything about the CDN guys (as I don't work for a CDN company directly or indirectly, regardless of what $BIG_NUMBER of people say/think) but the decrease in caching "efficiency" has to do with a very very poor understanding of what is actually possible with caching, both forward and reverse. I'm sure a lot of us cache people could go into great detail about how even highly-dynamic sites like yahoo, slashdot and hotmail could become cache-friendly, but it seems that:
* people are lazy / uneducated
* people *WANT* the traffic
Faced with high hw cost, licensing fees, and reduced efficiencies, the business case for keeping the caches in the network (with all the effort and ugliness it takes to maintain exclude lists) seems difficult to make.
Which cache products are you using? Some of the current and upcoming work with squid will probably change this, if I get my way. ;-)
PS: if you feel there's a place better suited to discussing this, shoot me a pointer. thx.
Hrm. Perhaps the wrec ietf group might be a better place to ask this type of question?
In general, I think that there are way too many web developers out there who are ignorant of just how simple yet powerful the caching primitives in HTTP/1.1 are, and too many companies who are just interested in more damned traffic. :-)
2c,
Adrian
-- Adrian Chadd <adrian@creative.net.au> "It was then that I knew that I wouldn't die, as a doctor wouldn't fart in front of a dying boy." -- Angela's Ashes
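For anyone wondering what those HTTP/1.1 caching primitives look like in practice, here is a minimal sketch of one of them: a validator (ETag) plus a short max-age, so a cache can revalidate cheaply instead of refetching. The function name `respond`, the SHA-1 based tag, and the 60-second lifetime are assumptions made for illustration; nothing here is taken from a particular server or cache product.

```python
import hashlib


def respond(body: bytes, request_headers: dict, max_age: int = 60):
    """Return (status, headers, body) for a GET, honoring If-None-Match.

    Hypothetical helper: a real server would also handle Last-Modified,
    Vary, multiple tags in If-None-Match, and so on.
    """
    # A strong validator derived from the content itself.
    etag = '"' + hashlib.sha1(body).hexdigest() + '"'
    headers = {
        "ETag": etag,
        # Let caches reuse the response briefly, then revalidate.
        "Cache-Control": f"max-age={max_age}",
    }
    if request_headers.get("If-None-Match") == etag:
        # Content unchanged: the cache keeps its copy, no body is sent.
        return 304, headers, b""
    return 200, headers, body
```

Even a page that changes every minute can be served this way: the origin still sees a request per revalidation, but only pays for the body when the content actually changed.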
In general, I think that there are way too many web developers out there who are ignorant of just how simple yet powerful the caching primitives in HTTP/1.1 are, and too many companies who are just interested in more damned traffic. :-)
Adrian
-- Adrian Chadd <adrian@creative.net.au> "It was then that I knew that I wouldn't die, as a doctor wouldn't fart in front of a dying boy." -- Angela's Ashes
Adrian, On this end of the pond, many of us measure success with MRTG and can very accurately plot profits using the same OIDs. You are absolutely right. When you're in the business of selling bandwidth, that is what you are interested in. Not being on the other end of a sat link helps I suppose. --- John Fraizer EnterZone, Inc
John Fraizer wrote:
In general, I think that there are way too many web developers out there who are ignorant of just how simple yet powerful the caching primitives in HTTP/1.1 are, and too many companies who are just interested in more damned traffic. :-)
Adrian
-- Adrian Chadd <adrian@creative.net.au> "It was then that I knew that I wouldn't die, as a doctor wouldn't fart in front of a dying boy." -- Angela's Ashes
Adrian,
On this end of the pond, many of us measure success with MRTG and can very accurately plot profits using the same OIDs. You are absolutely right. When you're in the business of selling bandwidth, that is what you are interested in. Not being on the other end of a sat link helps I suppose.
It might be worth thinking about the problem from the other end. From a web site owner's perspective, caching is a major annoyance. Here are the arguments you may encounter from a web site owner or web developer:
1. It interferes with content in many cases (web site visitors may see cached pages instead of current content). I know cache products claim this doesn't happen, but it has, and often.
2. The website owner loses information on how many visitors are coming to the site.
3. The website owner loses the demographics on where visitors are coming from, and especially the number of unique visitors. (It's not helpful to know that one cache engine visited, if that cache engine equated to 10,000 visits in an hour).
4. Banner advertising may or may not display properly when caching is involved, thereby costing the website money.
5. There's NOTHING in it for the website owner, other than the possibility that SOME pages might display faster for SOME users.
If folks running networks really think website designers and owners should care about caching, then there needs to be some sort of benefit (perhaps paid in dollars) to those affected. Otherwise, there's little reason for them to care.
-- Daniel Senie dts@senie.com Amaranth Networks Inc. http://www.amaranth.com
On Thu, 19 Oct 2000, Daniel Senie wrote:
It might be worth thinking about the problem from the other end. From a web site owner's perspective, caching is a major annoyance. Here are the arguments you may encounter from a web site owner or web developer:
1. It interferes with content in many cases (web site visitors may see cached pages instead of current content). I know cache products claim this doesn't happen, but it has, and often.
2. The website owner loses information on how many visitors are coming to the site.
3. The website owner loses the demographics on where visitors are coming from, and especially the number of unique visitors. (It's not helpful to know that one cache engine visited, if that cache engine equated to 10,000 visits in an hour).
Hmmm... Anyone ever considered addressing this via some sort of log passing protocol or somesuch?
On Thu, Oct 19, 2000, Patrick Greenwell wrote:
3. The website owner loses the demographics on where visitors are coming from, and especially the number of unique visitors. (It's not helpful to know that one cache engine visited, if that cache engine equated to 10,000 visits in an hour).
Hmmm... Anyone ever considered addressing this via some sort of log passing protocol or somesuch?
.. has anyone considered the usefulness of a 1x1 gif coming off every page on a site which had an expiry time of 0 whilst the rest of the website has useful cache information? :-)
Even a transparent gif? And in the <IMG SRC .. > tag you specify the width/height as being 1x1 so the page can be rendered whilst waiting for the gif?
Adrian
-- Adrian Chadd <adrian@creative.net.au> "It was then that I knew that I wouldn't die, as a doctor wouldn't fart in front of a dying boy." -- Angela's Ashes
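A rough sketch of the 1x1 gif idea above, assuming the site decides its response headers per path: the counter image expires immediately so every page view reaches the origin, while everything else carries a useful lifetime. The path `/hit.gif`, the function name `cache_headers`, and the one-hour default are made up for illustration.

```python
import time
from email.utils import formatdate

PIXEL_PATH = "/hit.gif"  # hypothetical name for the 1x1 counter image

# Width/height are given so the page can render before the gif arrives.
PIXEL_TAG = '<img src="/hit.gif" width="1" height="1" alt="">'


def cache_headers(path, now=None, page_ttl=3600):
    """Return response headers: uncachable pixel, cachable everything else."""
    if now is None:
        now = time.time()
    date = formatdate(now, usegmt=True)
    if path == PIXEL_PATH:
        # Expires equal to Date means "already stale": caches must
        # go back to the origin, which is what gets the hit logged.
        return {
            "Date": date,
            "Expires": date,
            "Cache-Control": "no-cache, max-age=0",
        }
    # Ordinary content may be served from cache for page_ttl seconds.
    return {
        "Date": date,
        "Expires": formatdate(now + page_ttl, usegmt=True),
        "Cache-Control": f"public, max-age={page_ttl}",
    }
```

The origin then counts page views from requests for the pixel alone, and the heavy objects stay in the caches.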
At 09:45 10/19/00 -0700, Patrick Greenwell wrote:
On Thu, 19 Oct 2000, Daniel Senie wrote:
It might be worth thinking about the problem from the other end. From a web site owner's perspective, caching is a major annoyance. Here are the arguments you may encounter from a web site owner or web developer:
1. It interferes with content in many cases (web site visitors may see cached pages instead of current content). I know cache products claim this doesn't happen, but it has, and often.
2. The website owner loses information on how many visitors are coming to the site.
3. The website owner loses the demographics on where visitors are coming from, and especially the number of unique visitors. (It's not helpful to know that one cache engine visited, if that cache engine equated to 10,000 visits in an hour).
Hmmm... Anyone ever considered addressing this via some sort of log passing protocol or somesuch?
This is something identified by Martin Hamilton in the IETF wrec working group's "Known Problems" document (draft-ietf-wrec-know-prob-02.txt) but which was identified as out of scope for the group. That said, the work being proposed in the content peering community (http://www.content-peering.org/) identifies a need for something similar. As Scott said, there's also RFC2227, though that doesn't appear to have much support.
On Thu, Oct 19, 2000 at 12:28:04PM -0400, Daniel Senie wrote:
It might be worth thinking about the problem from the other end. From a web site owner's perspective, caching is a major annoyance. Here are the arguments you may encounter from a web site owner or web developer:
Agreed. It's an annoyance. It could be considered a cost of doing business...
1. It interferes with content in many cases (web site visitors may see cached pages instead of current content). I know cache products claim this doesn't happen, but it has, and often.
Then the content probably needs fixing. (eg with caching primitives)
2. The website owner loses information on how many visitors are coming to the site.
Then the content probably needs fixing. (make the smallest object uncachable, perhaps with use of caching primitives, an empty cgi, a bit of javascript etc)
3. The website owner loses the demographics on where visitors are coming from, and especially the number of unique visitors. (It's not helpful to know that one cache engine visited, if that cache engine equated to 10,000 visits in an hour).
Sometimes that's just tough luck (some people use caches/proxies to protect their own identities). Other times the content could be fixed (cookies. ugh.)
4. Banner advertising may or may not display properly when caching is involved, thereby costing the website money.
This is an orthogonal problem. The website owner doesn't control whether or not caching is involved so she should pay more attention to making sure that the banner advert is done in a way that works in either case! The secondary issue of needing to know when banner ads are seen... well, maybe the content could be fixed to help that too...
5. There's NOTHING in it for the website owner, other than the possibility that SOME pages might display faster for SOME users.
Don't website owners pay for bandwidth anymore? :-)
If folks running networks really think website designers and owners should care about caching, then there needs to be some sort of benefit (perhaps paid in dollars) to those affected. Otherwise, there's little reason for them to care.
Some website owners need to remember that some Internet users don't use a cache/proxy out of choice, but because their ISP forces all (web) traffic through it. Not taking caching into account when designing a website is fine if the website owner knows the setup that all her customers will be using (like an internal corporate website), and it's fine if the owner doesn't want all those things you mention to work with all the audience, but it seems like a silly idea for -this- Internet. Regards, Andrew -- andrewb@demon.net
On Thu, Oct 19, 2000 at 06:36:03PM +0100, Andrew Bangs wrote:
5. There's NOTHING in it for the website owner, other than the possibility that SOME pages might display faster for SOME users.
It's worse than that. Many large providers, including at least a couple of tier 1's that I know of, are transparently proxying port 80 traffic for purposes of caching. If you don't make your pages cache friendly, you are potentially sacrificing a large amount of 'traffic' as people behind such caches will experience problems with the page. Making your pages inoperable with caches will not eliminate the caching; it'll simply send your traffic to your competitors.
If folks running networks really think website designers and owners should care about caching, then there needs to be some sort of benefit (perhaps paid in dollars) to those affected. Otherwise, there's little reason for them to care.
Think again. When potential customers' dollars walk to the competition, you will start caring. I'm not going to pay you so that my customers can visit your site and buy things from you -- that's completely out of the question. --msa
On Thu, Oct 19, 2000 at 12:12:36PM -0700, Majdi S. Abbas wrote:
On Thu, Oct 19, 2000 at 06:36:03PM +0100, Andrew Bangs wrote:
5. There's NOTHING in it for the website owner, other than the possibility that SOME pages might display faster for SOME users.
No, I didn't write that. I quoted it in reply to Daniel Senie. Regards, Andrew -- andrewb@demon.net
On Thu, 19 Oct 2000, Majdi S. Abbas wrote:
If folks running networks really think website designers and owners should care about caching, then there needs to be some sort of benefit (perhaps paid in dollars) to those affected. Otherwise, there's little reason for them to care.
Think again. When potential customers' dollars walk to the competition, you will start caring. I'm not going to pay you so that my customers can visit your site and buy things from you -- that's completely out of the question.
--msa
Your customers are paying YOU to provide end-to-end connectivity on the internet. If that involves a settlement-based peering arrangement, you're saying that it is completely out of the question? Have fun with your non-connected network. I'm betting your customers' dollars will be RUNNING to the competition. --- John Fraizer EnterZone, Inc
On Thu, 19 Oct 2000, Andrew Bangs wrote:
On Thu, Oct 19, 2000 at 12:28:04PM -0400, Daniel Senie wrote:
It might be worth thinking about the problem from the other end. From a web site owner's perspective, caching is a major annoyance. Here are the arguments you may encounter from a web site owner or web developer:
Agreed. It's an annoyance. It could be considered a cost of doing business...
I find it strange that you used the "cost of doing business" argument. You see, the website owner IS WILLING to pay for his content to be delivered. It seems that there are a lot of providers out there who are unwilling to pay the toll for that content to enter their network and be delivered to their customer -- the person who is paying them to do just that -- pass 1's and 0's from endpoint to endpoint, WITHOUT TAMPERING WITH THEM!!! --- John Fraizer EnterZone, Inc
Daniel Senie wrote:
It might be worth thinking about the problem from the other end. From a web site owner's perspective, caching is a major annoyance. Here are the arguments you may encounter from a web site owner or web developer:
1. It interferes with content in many cases (web site visitors may see cached pages instead of current content). I know cache products claim this doesn't happen, but it has, and often.
In reality, there are very few things that are dynamic. I don't honor zero time expiry. Even MRTG doesn't need a granularity of less than 30 seconds.
2. The website owner loses information on how many visitors are coming to the site.
Why should the ISP care about a faulty model? Does the website owner pay the ISPs to collect such information?
3. The website owner loses the demographics on where visitors are coming from, and especially the number of unique visitors. (It's not helpful to know that one cache engine visited, if that cache engine equated to 10,000 visits in an hour).
People go to websites to learn information. They don't go to websites to involuntarily give information. It's a de facto privacy violation.
4. Banner advertising may or may not display properly when caching is involved, thereby costing the website money.
Not all business plans are viable. Click-throughs may work properly, but impressions do not make any sense.
5. There's NOTHING in it for the website owner, other than the possibility that SOME pages might display faster for SOME users.
Don't website owners have to pay for bandwidth?
If folks running networks really think website designers and owners should care about caching, then there needs to be some sort of benefit (perhaps paid in dollars) to those affected.
If website owners don't properly interact with caching, then there needs to be some sort of benefit (definitely paid in dollars) to those affected. WSimpson@UMich.edu Key fingerprint = 17 40 5E 67 15 6F 31 26 DD 0D B9 9B 6A 15 2C 32
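The reply to point 1 above describes a cache that does not honor zero-time expiry and treats about 30 seconds as the finest useful granularity. A minimal sketch of that policy, assuming the cache derives freshness only from the `max-age` directive; the name `effective_ttl` and the choice to apply the floor even when no `max-age` is present are assumptions, and overriding the origin's stated lifetime this way is a deliberate departure from what the site asked for.

```python
def effective_ttl(cache_control: str, floor: int = 30) -> int:
    """Seconds this (hypothetical) cache will treat a response as fresh.

    The origin's max-age is used, but never less than `floor` seconds.
    """
    max_age = None
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            max_age = int(value)
    if max_age is None:
        # Assumption: with no usable max-age, fall back to the floor.
        return floor
    return max(max_age, floor)
```

With this in place `max-age=0` is served from cache for 30 seconds, while `max-age=600` is left alone.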
I vehemently disagree with the statement that impressions do not make any sense, only clickthroughs. There is such a thing as brand awareness, a situation where a banner ad is good for itself even if it doesn't lead to click through. It is NOT for YOU to decide what business model makes sense for MY business relationship with MY advertisers. I pay my ISP to carry IP packets around. Caching is acceptable in some cases but not in others. In some cases certainly your cache is in fact a copyright violation. In general I do not want people to have my photos stored in their browser cache (much less permanently saved). I do actually have plans to change around some things in my site to take advantage of browser and network caching (e.g. putting the style sheet in a separate file, ditto the JavaScript and any other constant information I can). When I switch to CGI-based delivery of images the cache will of course become pass-through since there will be no file to cache just a stream of bytes.... ----- Original Message ----- From: "William Allen Simpson" <wsimpson@greendragon.com> To: <nanog@merit.edu> Sent: Friday, October 20, 2000 8:24 AM Subject: Re: decreased caching efficiency?
Daniel Senie wrote:
It might be worth thinking about the problem from the other end. From a web site owner's perspective, caching is a major annoyance. Here are the arguments you may encounter from a web site owner or web developer:
1. It interferes with content in many cases (web site visitors may see cached pages instead of current content). I know cache products claim this doesn't happen, but it has, and often.
In reality, there are very few things that are dynamic. I don't honor zero time expiry. Even MRTG doesn't need a granularity of less than 30 seconds.
2. The website owner loses information on how many visitors are coming to the site.
Why should the ISP care about a faulty model? Does the website owner pay the ISPs to collect such information?
3. The website owner loses the demographics on where visitors are coming from, and especially the number of unique visitors. (It's not helpful to know that one cache engine visited, if that cache engine equated to 10,000 visits in an hour).
People go to websites to learn information. They don't go to websites to involuntarily give information. It's a de facto privacy violation.
4. Banner advertising may or may not display properly when caching is involved, thereby costing the website money.
Not all business plans are viable. Click-throughs may work properly, but impressions do not make any sense.
5. There's NOTHING in it for the website owner, other than the possibility that SOME pages might display faster for SOME users.
Don't website owners have to pay for bandwidth?
If folks running networks really think website designers and owners should care about caching, then there needs to be some sort of benefit (perhaps paid in dollars) to those affected.
If website owners don't properly interact with caching, then there needs to be some sort of benefit (definitely paid in dollars) to those affected.
WSimpson@UMich.edu Key fingerprint = 17 40 5E 67 15 6F 31 26 DD 0D B9 9B 6A 15 2C 32
Dana Hudes wrote:
I vehemently disagree with the statement that impressions do not make any sense, only clickthroughs. There is such a thing as brand awareness, a situation where a banner ad is good for itself even if it doesn't lead to click through.
Of course, in that case, the benefit is to the advertiser. That is, they get the benefit, but you don't get paid. Not my problem. That seems to follow "not make any sense", but YMMV.
It is NOT for YOU to decide what business model makes sense for MY business relationship with MY advertisers.
Nope. You can have any business relationship you'd like. But, by the same token, it is not for *YOU* to decide that *I* have to pay to support YOUR business decision. Last time I looked, there's no constitutional right that guarantees that you can make money.
I pay my ISP to carry IP packets around.
But, you don't pay ME to carry your IP packets around. My customers pay me. I pay my upstream. Therefore, I pay my upstream as little as possible.
In some cases certainly your cache is in fact a copyright violation.
Interesting, if true. Perhaps you could provide a citation? WSimpson@UMich.edu Key fingerprint = 17 40 5E 67 15 6F 31 26 DD 0D B9 9B 6A 15 2C 32
----- Original Message ----- From: "William Allen Simpson" <wsimpson@greendragon.com> To: <nanog@merit.edu> Sent: Friday, October 20, 2000 10:04 AM Subject: Re: decreased caching efficiency?
Dana Hudes wrote:
I vehemently disagree with the statement that impressions do not make any sense, only clickthroughs. There is such a thing as brand awareness, a situation where a banner ad is good for itself even if it doesn't lead to click through.
Of course, in that case, the benefit is to the advertiser. That is, they get the benefit, but you don't get paid. Not my problem.
That seems to follow "not make any sense", but YMMV.
No, you are interfering with my revenue stream by preventing my getting credit for the banner impression.
It is NOT for YOU to decide what business model makes sense for MY business relationship with MY advertisers.
Nope. You can have any business relationship you'd like. But, by the same token, it is not for *YOU* to decide that *I* have to pay to support YOUR business decision.
Last time I looked, there's no constitutional right that guarantees that you can make money.
I pay my ISP to carry IP packets around.
But, you don't pay ME to carry your IP packets around. My customers pay me. I pay my upstream. Therefore, I pay my upstream as little as possible.
right. your customers pay you to get them the packets they asked for and if they want to visit my site and see my content and your cache breaks that, you're not delivering what your customers requested. My site won't deliver content in most of the pages without the ads displaying. At the moment there is a timeout built in while I wait for the ad network to increase server capacity to meet demand. It will go away.
In some cases certainly your cache is in fact a copyright violation.
Interesting, if true. Perhaps you could provide a citation?
show me where I licensed your cache to store (copy) my photographs or that it constitutes fair use. ISPs are not common carriers. A thread on this went by last week with citations from the FCC. I volunteer as plaintiff if one of the lurking attorneys will volunteer to make some case law by suing cache using sites for copyright violation.
WSimpson@UMich.edu Key fingerprint = 17 40 5E 67 15 6F 31 26 DD 0D B9 9B 6A 15 2C 32
-----BEGIN PGP SIGNED MESSAGE----- Dana Hudes wrote:
From: "William Allen Simpson" <wsimpson@greendragon.com>
That seems to follow "not make any sense", but YMMV.
No, you are interfering with my revenue stream by preventing my getting credit for the banner impression.
Your bad business models are not my problem. As I wrote before (and you didn't respond):
Last time I looked, there's no constitutional right that guarantees that you can make money.
right. your customers pay you to get them the packets they asked for and if they want to visit my site and see my content and your cache breaks that, you're not delivering what your customers requested. My site won't deliver content in most of the pages without the ads displaying.
Interesting, if true. For example, I use Netscape. Netscape has long had the wonderful feature that images aren't loaded, unless I hit the Images button. Only rarely do I ever want to see images. Don't forget Opera, iCab, and other fine browsers that automatically filter out various styles of junk. And all of them use a disc and/or memory cache. So, your site won't load, and customers just go on to the next site. Actually, that just means fewer and fewer folks will visit your site, and fewer and fewer folks will buy your hosting.
Interesting, if true. Perhaps you could provide a citation?
show me where I licensed your cache to store (copy) my photographs or that it constitutes fair use.
You don't seem to have answered the question. I believe that you explicitly granted the access to your material, using DNS and BGP and by responding to the HTTP get request. I don't understand how I access it otherwise. Alternatively, I cannot carry your IP packets at all. Therefore, at such time as you tell everyone on NANOG your network numbers carrying your strictly licensed material, we'll be happy to filter your BGP announcements, up to and until you explicitly give us permission to carry them again.
I volunteer as plaintiff if one of the lurking attorneys will volunteer to make some case law by suing cache using sites for copyright violation.
I'll be happy to volunteer as defendant. William Allen Simpson 1384 Fontaine Madison Heights MI 48071 Although I would appreciate waiting a few months, as I already have a case going at 6th Circuit, and I'm an expert witness in another. I really prefer not to handle more than 1 or 2 at a time -- it's too distracting from real life. -----BEGIN PGP SIGNATURE----- Version: PGP 6.5.1 iQCVAwUBOfBkgtm/qMj6R+sxAQG0cAQAm4f4lspVqovVoEs44ZUlPcO3irBIb9Tz a1X4ES/ZLojhSqM+5OV1RKfVo5lO4HML4tyHC5CpSzd4veZgOwi7wBmC6a8ANgti 67rC0AK1J+xSw8761H4PMk+26i22VMXK7p77UXhnOeoe1Q9pxQPAcWM7gtV1lyU1 ShX732xJYYU= =weq1 -----END PGP SIGNATURE-----
I think you have a few misconceptions. ----- Original Message ----- From: "William Allen Simpson" <wsimpson@greendragon.com> To: <nanog@merit.edu> Sent: Friday, October 20, 2000 11:29 AM Subject: Re: decreased caching efficiency?
-----BEGIN PGP SIGNED MESSAGE-----
Dana Hudes wrote:
From: "William Allen Simpson" <wsimpson@greendragon.com>
That seems to follow "not make any sense", but YMMV.
No, you are interfering with my revenue stream by preventing my getting credit for the banner impression.
Your bad business models are not my problem.
As I wrote before (and you didn't respond):
Last time I looked, there's no constitutional right that guarantees that you can make money.
certainly true and the same is applicable to you.
right. your customers pay you to get them the packets they asked for and if they want to visit my site and see my content and your cache breaks that, you're not delivering what your customers requested. My site won't deliver content in most of the pages without the ads displaying.
Interesting, if true.
For example, I use Netscape. Netscape has long had the wonderful feature that images aren't loaded, unless I hit the Images button. Only rarely do I ever want to see images.
Then why the heck would you visit my photo gallery? to read the captions??
Don't forget Opera, iCab, and other fine browsers that automatically filter out various styles of junk.
Hmm. I'll have to check out Opera. If indeed it prevents ads, I'll have to add JavaScript to prevent Opera users from viewing my photos.
And all of them use a disc and/or memory cache.
Memory copies are an interesting secondary issue. There has been litigation on this unrelated to the Internet that held that loading a program into RAM from disc is making a copy and requires a license. The issue was some sort of systems maintenance company which wasn't licensed.
So, your site won't load, and customers just go on to the next site. Actually, that just means fewer and fewer folks will visit your site, and fewer and fewer folks will buy your hosting.
No, you've got it wrong. The ads are on the pages. Apache, at the moment, knows nothing of the ads. Anyone I provide virtual hosting for is paying money and can do any legal thing they want with their content. They want ads, they go make their own deals. This isn't angelfire or yahoo.
Interesting, if true. Perhaps you could provide a citation?
show me where I licensed your cache to store (copy) my photographs or that it constitutes fair use.
You don't seem to have answered the question.
I guess I'll cite the entire copyright law....USC Title 31 isn't it? I'll have to look. But there is always the case law. And even if I gave you case citation you would not interpret it properly.
I believe that you explicitly granted the access to your material, using DNS and BGP and by responding to the HTTP get request. I don't understand how I access it otherwise.
Alternatively, I cannot carry your IP packets at all. Therefore, at such time as you tell everyone on NANOG your network numbers carrying your strictly licensed material, we'll be happy to filter your BGP announcements, up to and until you explicitly give us permission to carry them again..
You are being highly silly as you confuse layer 3 with layer 7 packets for a particular application. Why are you on the side of thieves? My photo pages are supported by advertising. I am the copyright holder for the photos and the pages. The license to view the photos is dependent on the display of the banner ads. If you view the photos without the ads, absent some other license negotiated with the owner of the copyright, you are stealing from me.
I volunteer as plaintiff if one of the lurking attorneys will volunteer to make some case law by suing cache using sites for copyright violation.
I'll be happy to volunteer as defendant.
William Allen Simpson 1384 Fontaine Madison Heights MI 48071
Although I would appreciate waiting a few months, as I already have a case going at 6th Circuit, and I'm an expert witness in another. I really prefer not to handle more than 1 or 2 at a time -- it's too distracting from real life.
-----BEGIN PGP SIGNATURE----- Version: PGP 6.5.1
iQCVAwUBOfBkgtm/qMj6R+sxAQG0cAQAm4f4lspVqovVoEs44ZUlPcO3irBIb9Tz a1X4ES/ZLojhSqM+5OV1RKfVo5lO4HML4tyHC5CpSzd4veZgOwi7wBmC6a8ANgti 67rC0AK1J+xSw8761H4PMk+26i22VMXK7p77UXhnOeoe1Q9pxQPAcWM7gtV1lyU1 ShX732xJYYU= =weq1 -----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE----- Dana Hudes wrote:
My photo pages are supported by advertising. I am the copyright
holder for the photos and the pages. The license to view the photos is dependent on the display of the banner ads. If you view the photos without the ads, absent some other license negotiated with the owner of the copyright, you are stealing
from me.
Funny that this message crossed paths while I was composing one that said: "Does somebody here argue that every consumer should be required to buy every product they see advertised on television, or never watch the shows? (Admittedly, some places it seems that anyone will argue anything anytime.)" I've just viewed your pages. I saw no ads. (I did see places where some text said things like click here and visit my sponsor, but didn't bother looking further.) I loaded 2 images (nymphad.jpg and washmon.jpg) and saved them to disk to look at again and again as often as I wish. You have my address. Use it wisely. -----BEGIN PGP SIGNATURE----- Version: PGP 6.5.1 iQCVAwUBOfB5Jtm/qMj6R+sxAQEEOwP8CNrApMUGOjBvfbGUIUR1bPtMzM+jzVXW ZW2B9erSvwNBREEL+mA81QnqQ3sATQynscF8UYWRSvNPhsr273AOzsXtGnqyBULE m8aZDjWxa2rVsr62a9OVzA5XSVCDs7e98jwGFRL88qALmic6l+jaaZPMEvdz8ATX hzGxdpa3yeA= =rWMz -----END PGP SIGNATURE-----
Hey gang,
This discussion is so unbelievably out of control.. ;) At least we're still true to the good old NANOG tradition of scope creep.
Why are we discussing manipulating content instead of just transport caching? Can we please stop this nonsense? Including the myriad pseudo copyright lawyers on this list, and the large number of falsehoods they portray as facts... GO AWAY.
A properly working cache should never deliver the page to the recipient differently than the originator intended. If it does, it's by definition not a cache. Caches do not alter, they just.. well,.. cache. Everything else is not caching.
Thanks, Chris
-- Christian Kuhtz Architecture, BellSouth.net <ck@arch.bellsouth.net> -wk, <ck@gnu.org> -hm Atlanta, GA "Speaking for myself only."
On Fri, Oct 20, 2000 at 10:24:27AM -0400, Dana Hudes wrote:
No, you are interfering with my revenue stream by preventing my getting credit for the banner impression.
Tough. Banner ads aren't a guaranteed form of revenue. How would you feel if I said my cache at home filters banner content out? You do not have a guaranteed right to spew advertisements. If banner revenue is a large part of your revenue model, I think you need to consider revising it.
right. your customers pay you to get them the packets they asked for and if they want to visit my site and see my content and your cache breaks that, you're not delivering what your customers requested. My site won't deliver content in most of the pages without the ads displaying.
Not having a site that is cache friendly is the equivalent of not having a site that works with Netscape|IE|etc. I don't see how your content is the responsibility of anyone else but you.
At the moment there is a timeout built in while I wait for the ad network to increase server capacity to meet demand. It will go away.
Ahh, so your page is disgustingly slow, and you want to keep it that way. --msa
----- Original Message ----- From: "Majdi S. Abbas" <msa@samurai.sfo.dead-dog.com> To: "Dana Hudes" <dhudes@hudes.org> Cc: <nanog@merit.edu> Sent: Friday, October 20, 2000 12:33 PM Subject: Re: decreased caching efficiency?
On Fri, Oct 20, 2000 at 10:24:27AM -0400, Dana Hudes wrote:
No, you are interfering with my revenue stream by preventing my getting credit for the banner impression.
Tough. Banner ads aren't a guaranteed form of revenue.
Neither is being an ISP a guarantee of revenue. I have a guarantee that if an ad displays my account is credited. It's contractual. Not even an electronic contract: paper, executed by both parties. I dare say the only ones sure to make a profit in this are Cisco/Juniper/Nortel/Lucent/Foundry et al. (each to different levels based on products). Without their equipment none of us have any revenue.
How would you feel if I said my cache at home filters banner content out?
I hope my JavaScript would detect this and refuse to display the photograph.
You do not have a guaranteed right to spew advertisements.
Yes I do. I don't have rights to spew ads on other people's content, but my content is mine. I can have the page refuse to display without the ads if I so choose.
If banner revenue is a large part of your revenue model, I think you need to consider revising it.
right. your customers pay you to get them the packets they asked for and if they want to visit my site and see my content and your cache breaks that, you're not delivering what your customers requested. My site won't deliver content in most of the pages without the ads displaying.
Not having a site that is cache friendly is the equivalent of not having a site that works with Netscape|IE|etc. I don't see how your content is the responsibility of anyone else but you.
At the moment there is a timeout built in while I wait for the ad network to increase server capacity to meet demand. It will go away.
Ahh, so your page is disgustingly slow, and you want to keep it that way.
No, it doesn't delay unless the ads are delayed. If the ads come right up the photo loads right away. Otherwise (due to JavaScript issues that prevent use of an event-driven model) the script checks frequently to see if the ads are up. People with slow computers that have current browsers and o/s with inadequate RAM experience slow display. My content is my property not yours.
--msa
[ On Friday, October 20, 2000 at 12:43:51 (-0400), Dana Hudes wrote: ]
Subject: Re: decreased caching efficiency?
Neither is being an ISP a guarantee of revenue.
Not if you're an idiot and can't measure input vs. output and figure out how to charge more for what you deliver than what you buy from your upstream, no; but otherwise you've got to work pretty hard *not* to have good revenue, if not a profit too!
I have a guarantee that if an ad displays my account is credited. It's contractual. Not even an electronic contract: paper, executed by both parties.
Cool. Good for you. Now you need to learn just how to make sure that you can count those display events. I suspect your attitude is doing more to prevent this possibility than to ensure it. Are you really sure you want to be turning people away from the content that they've sought out just because your banner advert spam blocks the content until it's displayed? E.g. if your medium were print would you really want all of the other pages in the publication to appear blank before your advert was viewed? Get Real! You'd be laughed out of town forever! This is a stupid discussion, and not suitable for NANOG. Please take it elsewhere! -- Greg A. Woods +1 416 218-0098 VE3TCP <gwoods@acm.org> <robohack!woods> Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>
--- Jason Slagle - CCNA - CCDA Network Administrator - Toledo Internet Access - Toledo Ohio - raistlin@tacorp.net - jslagle@toledolink.com - WHOIS JS10172 -----BEGIN GEEK CODE BLOCK----- Version: 3.12 GE d-- s:+ a-- C++ UL+++ P--- L+++ E- W- N+ o-- K- w--- O M- V PS+ PE+++ Y+ PGP t+ 5 X+ R tv+ b+ DI+ D G e+ h! r++ y+ ------END GEEK CODE BLOCK------ On Fri, 20 Oct 2000, Dana Hudes wrote:
Yes I do. I don't have rights to spew ads on other people's content, but my content is mine. I can have the page refuse to display without the ads if I so choose.
Sure, you have the right to spew them on your page, but I have the right to not view them if I don't want to see them. I looked at your page and nowhere do I see a mention of the fact I HAVE to view the ad to see the picture. I see a request that I do so in the status bar.
No, it doesn't delay unless the ads are delayed. If the ads come right up the photo loads right away. Otherwise (due to JavaScript issues that prevent use of an event-driven model) the script checks frequently to see if the ads are up. People with slow computers that have current browsers and o/s with inadequate RAM experience slow display.
My content is my property not yours.
Not arguing that. More and more frequently there has been this drive for cheap bandwidth to the homes. The margins on cablemodem/DSL from what I can tell are VERY low and caching plays a large part in making it possible from the models I've run (We are not yet into broadband, but should be within a month or 3). Are you willing to pay a premium for uncached service? I'll surely provide it to you if you are. But, when you want 1.5 Mb/s down for $40 a month, you can pretty much guarantee that I'm going to do whatever I can to limit the amount of traffic to you that leaves my network. Jason
William Allen Simpson wrote:
Dana Hudes wrote:
I vehemently disagree with the statement that impressions do not make any sense, only clickthroughs. There is such a thing as brand awareness, a situation where a banner ad is good for itself even if it doesn't lead to click through.
Of course, in that case, the benefit is to the advertiser. That is, they get the benefit, but you don't get paid. Not my problem.
So, when Sears buys a full page advertisement in the Boston Globe, they should get that for free, and only have to pay the Globe when someone shows up at the door holding a copy of the ad? Sears gains brand recognition through its ads in newspapers, on TV, and on any other medium it employs. Banner advertising impressions serve exactly this purpose. For example, go to Yahoo and do a search on Honda automobiles. Do you think it's a coincidence that the banner ad that shows is a Honda ad? Do you think Yahoo only gets paid for that ad if someone clicks through it? What is it about this medium which you think makes it so different from any other?
That seems to follow "not make any sense", but YMMV.
It is NOT for YOU to decide what business model makes sense for MY business relationship with MY advertisers.
Nope. You can have any business relationship you'd like. But, by the same token, it is not for *YOU* to decide that *I* have to pay to support YOUR business decision.
Last time I looked, there's no constitutional right that guarantees that you can make money.
I wonder if you've told your customers that you do the equivalent of clipping all the ads out of the newspaper you deliver to them? I don't know that the courts will consider your arguments to be valid. I do expect it'll be the courts that settle this. ISPs claim the right to decide what traffic to pass over their wires. At the point at which you start selectively editing that content (which is essentially what's happening when you alter the content in some fashion) you may well get yourself in trouble later.
I pay my ISP to carry IP packets around.
But, you don't pay ME to carry your IP packets around. My customers pay me. I pay my upstream. Therefore, I pay my upstream as little as possible.
And you've informed them that you filter and modify the information they look at over the Internet, right? I don't recall seeing that in the terms of service for the providers I use. -- ----------------------------------------------------------------- Daniel Senie dts@senie.com Amaranth Networks Inc. http://www.amaranth.com
On Fri, Oct 20, 2000 at 10:47:35AM -0400, Daniel Senie wrote:
William Allen Simpson wrote:
Dana Hudes wrote:
I vehemently disagree with the statement that impressions do not make any sense, only clickthroughs. There is such a thing as brand awareness, a situation where a banner ad is good for itself even if it doesn't lead to click through.
Of course, in that case, the benefit is to the advertiser. That is, they get the benefit, but you don't get paid. Not my problem.
So, when Sears buys a full page advertisement in the Boston Globe, they should get that for free, and only have to pay the Globe when someone shows up at the door holding a copy of the ad?
Try not to compare apples and oranges. If your assumptions are bad, your conclusions are unlikely to be accurate. ;-) (I'd love to see a serious non-flame response from the content junkies to address some of the verbiage below written by me, a service provider junkie). The trucking companies aren't sponsoring Sears ads or the Boston Globe by shuttling around the Globe issue (including the printed ad) for free. In fact, as long as the issues look exactly as they did when the Boston Globe printed them by the time a customer buys them (or has them delivered pre-paid to the doorstep), why do you care how the trucking company schedules its trucks, or what types of trucks it owns? It's irrelevant.. as long as... All the Boston Globe and shop owners care about is that they meet whatever SLA allows them to sell the paper at a well-defined time in the morning. Sure there's a problem with caches breaking content. On the other hand, paying your hometown ISP doesn't give you guaranteed world-wide rights towards an SLA. SLAs cost service providers money to enforce. So, how do you provide settlement between your implicit expectation and the myriad different contracts people around the world have with their subscribers? Are you sure you *REALLY* want to be billed for the cost you might incur? Caches are a response to the lack of settlement, and the recognition that as long as site owners provide reasonable assumptions, caches will do reasonably well following those assumptions. If a website is designed badly, caches will do poorly and so on. Caches are in fact a reflection of the lack of settlement, or rather a totally different settlement mechanism in the Internet. Ever heard of the phrase "there is no such thing as a free lunch"? Think again. Just for a minute. Somebody has to pay for the pipe at some point.
So, if you want guaranteed delivery, I'm sure no SP would ever turn you down if you wanted to subsidize the infrastructure your guaranteed delivery is occurring on by reimbursing them for their costs (or at least contribute). The Internet is not like other media in that there's not necessarily the "cost causer cost" principle at work. In fact, a lot of times you get what looks like a free ride. And you get exactly what you pay for. If you want something else, *PAY* for a CDN.
Sears gains brand recognition through its ads in newspapers, on TV, and on any other medium it employs.
Sure it does. Are you going to tell a broadcasting station that it can't MPEG compress your ad as it is broadcast to be transmitted more efficiently over a satellite link or other digital distribution network? No, obviously not, because it is in your best interest to allow them to reach the furthest possible audience. That's what SPs do. It's a means for them to control the explosion in bw that they have to carry with no or little additional revenue dollars coming in to pay for it.
Banner advertising impressions serve exactly this purpose.
Except that you're stirring up all sorts of false assumptions about the similarity between these other media and the Internet.
For example, go to Yahoo and do a search on Honda automobiles. Do you think it's a coincidence that the banner ad that shows is a Honda ad? Do you think Yahoo only gets paid for that ad if someone clicks through it? What is it about this medium which you think makes it so different from any other?
See above. (May I note that Y! stock just got tanked despite beating earnings because of, among other things, the fact that "click thru revenues" and "banner ads" in general are having less and less of an effect on users.) So, how does Y! pay the entire world for carrying its traffic? Somewhere in your math there's a flaw as to who gets to pay for what. And only if you close that missing link will you understand why there's a difference, particularly who pays for bw on the Internet.
That seems to follow "not make any sense", but YMMV.
It is NOT for YOU to decide what business model makes sense for MY business relationship with MY advertisers.
Nope. You can have any business relationship you'd like. But, by the same token, it is not for *YOU* to decide that *I* have to pay to support YOUR business decision.
Last time I looked, there's no constitutional right that guarantees that you can make money.
I wonder if you've told your customers that you do the equivalent of clipping all the ads out of the newspaper you deliver to them? I don't know that the courts will consider your arguments to be valid.
This is beside the point of caching, but.. My $.02, quite frankly, if a newspaper had all blank pages where there are ads now, I would not mind at all. The cutouts are ugly (and your analogy implies that they make the product cumbersome to use, which is a false assumption, because a lack of banner ads does not make a webpage cumbersome to use). But, say a printer (because the contract w/ Boston Globe specified no reimbursement relative to the number of pages) decides to not print the ads because it saves him ink. It's a free market, you can find a different printer and make sure the contract specifies he must print ads. You bet your supply chain will notice the difference in weight and start billing you appropriately. Point is, THERE IS NO FREE LUNCH. Why are you *STILL* expecting one? Or, better yet, the printer uses a different printing process for recurring ads to be more efficient. Contract specifies he can or doesn't explicitly prohibit that. Or, better yet, the customer who buys the paper clips the ads out himself or pays somebody to do so after they purchase the paper. Are you going to sue him, too, and why should you have a right to demand what happens with the article once I purchase it? So, how would you feel about an ISP who offers their customers junkbuster service to blow away all ads upon request by their customers, saving everybody bandwidth? In fact, one might argue, once you dump the packet on a wire paid for by somebody else, you relinquished ownership of the content. How would you like that? Isn't caching or other types of content distribution networks a much better approach all of a sudden? One that allows everybody to live and prosper rather than one living off the other?
I do expect it'll be the courts that settle this.
Sure, and don't expect them to follow your logic. The logic I just described looks awfully good to the various legal departments, and you can bet you will have just about any communications provider of any kind, land, terrestrial wireless, satellite etc countersuing.
ISPs claim the right to decide what traffic to pass over their wires.
And why is that wrong since they're paying for it? ISPs are not common carriers.
At the point at which you start selectively editing that content (which is essentially what's happening when you alter the content in some fashion) you may well get yourself in trouble later.
Well, not so fast. Perhaps I gave options to my subscriber and my subscriber said he wanted to be cached, or have his ads junkbusted.
I pay my ISP to carry IP packets around.
But, you don't pay ME to carry your IP packets around. My customers pay me. I pay my upstream. Therefore, I pay my upstream as little as possible.
And you've informed them that you filter and modify the information they look at over the Internet, right? I don't recall seeing that in the terms of service for the providers I use.
Woah. Where again did it *guarantee* delivery or a specific SLA? Do you also want to have the terms under which rerouting happens in an SP's network in the customer contract? I think not. That's like stating that a trucking company can only use Mack trucks to deliver the Boston Globe. If you push it that far, your content may not get all that far because people may flat turn you down. Sure, subscribers may complain, but they will scream even more if you charge them for the additional cost they incur. Remember now, as a subscriber you pay for bandwidth plus whatever other provisions the contract may have. That bandwidth creates cost somewhere. And you can bet that if you torpedo an SP's business model by generating more cost for them by generating disproportionate bw amounts which cannot be comp'ed by the subscriber revenues, you will get cancelled, have the contract changed, or whatever else it takes to make it fit again. Bottom line, it will have a consequence because the SP will hardly let you put them out of business. There is no free lunch. -- Christian Kuhtz Architecture, BellSouth.net <ck@arch.bellsouth.net> -wk, <ck@gnu.org> -hm Atlanta, GA "Speaking for myself only."
A couple more points.. On Fri, Oct 20, 2000 at 11:31:29AM -0400, Christian Kuhtz wrote:
And why is that wrong since they're paying for it? ISPs are not common carriers.
One more point, even a common carrier can offer a service to its subscribers to block certain content upon their request (various number blocking services, such as BellSouth's PrivacyDirector which sends blocked/undelivered caller #/id to an IVR to identify themselves, after which the customer gets prompted whether to accept the call.) Perfectly legal. And I suspect you'll see lots more of them as customers ask for them. In fact, this whole debate reminds me of when spam first came around. And some customers are most definitely willing to financially recognize it if you reduce the email spam they got. Personally, banner ads are just that: spam. But that's beside the point.
Woah. Where again did it *guarantee* delivery or a specific SLA? Do you also want to have the terms under which rerouting happens in an SP's network in the customer contract? I think not.
Hmm, ISPs live by oversubscribing their service. Phone companies do, too. So, something has to give. -- Christian Kuhtz Architecture, BellSouth.net <ck@arch.bellsouth.net> -wk, <ck@gnu.org> -hm Atlanta, GA "Speaking for myself only."
Daniel Senie wrote:
So, when Sears buys a full page advertisement in the Boston Globe, they should get that for free, and only have to pay the Globe when someone shows up at the door holding a copy of the ad?
Sears gains brand recognition through its ads in newspapers, on TV, and on any other medium it employs.
Banner advertising impressions serve exactly this purpose.
I cannot tell to whom you are replying, as your argument supports mine. Yep, I agree. Sears pays a fixed price to the Globe. It doesn't vary depending on how many folks buy that paper that day, and how many others happen to pass it along, or leave it on the train to be picked up by others. OTOH, measuring banner "impressions" isn't like that at all. A bad business model based on incorrect delivery assumptions. Meanwhile, some folks seem to have the idea that banner ads pay for content, and you shouldn't look at the content unless you also look at the ads. I'll remind folks that ads don't pay for anything. Consumer purchases pay. Does somebody here argue that every consumer should be required to buy every product they see advertised on television, or never watch the shows? (Admittedly, some places it seems that anyone will argue anything anytime.) I routinely throw away newspaper advertising circulars. I routinely tape TV shows, and skip the ads. I routinely read Wired (Washington Post, etc, etc) with images and cookies turned off. Now that we have beaten the philosophical part to death, could we please return to how to fix the topic of this thread: "decreased caching efficiency?" That's a technical problem! WSimpson@UMich.edu Key fingerprint = 17 40 5E 67 15 6F 31 26 DD 0D B9 9B 6A 15 2C 32
At 09:44 10/20/00 -0400, Dana Hudes wrote:
I vehemently disagree with the statement that impressions do not make any sense, only clickthroughs. There is such a thing as brand awareness, a situation where a banner ad is good for itself even if it doesn't lead to click through.
It is NOT for YOU to decide what business model makes sense for MY business relationship with MY advertisers.
I think the point being raised was that that business model is, by definition, based on a flawed assumption.
I pay my ISP to carry IP packets around. Caching is acceptable in some cases but not in others. In some cases certainly your cache is in fact a copyright violation.
And browsers implementing caches in memory or disk are also causing copyright violation. Yet for the most part browser caches are considered a Good Thing.
In general I do not want people to have my photos stored in their browser cache (much less permanently saved). I do actually have plans to change around some things in my site to take advantage of browser and network caching (e.g. putting the style sheet in a separate file, ditto the JavaScript and any other constant information I can).
And that saves practically nothing, given that they're very small files. Your choice, of course.
When I switch to CGI-based delivery of images the cache will of course become pass-through since there will be no file to cache just a stream of bytes....
Is the assumption there that by using CGI you'll automatically tweak a configuration in a caching proxy? If so then it's a flawed assumption. Having had a very quick look at your site, it seems a little strange that you want to defeat caching of those objects that soak up bandwidth; the request to perform "click-through" on the advert suggests that you're using the revenue to pay for your bandwidth costs. (So, one assumes that the more the material was cached, the less you'd have to pay, and the less you'd have to worry about page impressions.) I particularly like the way that you require my browser to send a Referer field to be allowed to view the pictures ;-) Nice photos though...
Ian ----- Original Message ----- From: "Ian Cooper" <icooper@equinix.com> To: "Dana Hudes" <dhudes@hudes.org> Cc: <nanog@merit.edu> Sent: Friday, October 20, 2000 10:33 AM Subject: Re: decreased caching efficiency?
At 09:44 10/20/00 -0400, Dana Hudes wrote:
And browsers implementing caches in memory or disk are also causing copyright violation. Yet for the most part browser caches are considered a Good Thing.
No and yes. The user has a license to display the image on their screen for as long as they keep the browser window open on the page. Browser cache is good for the user because they may scroll or resize the window and don't have to fetch again.
In general I do not want people to have my photos stored in their browser cache (much less permanently saved). I do actually have plans to change around some things in my site to take advantage of browser and network caching (e.g. putting the style sheet in a separate file, ditto the JavaScript and any other constant information I can).
And that saves practically nothing, given that they're very small files. Your choice, of course.
True. The hope is somehow though to save processing the files not just transferring.
When I switch to CGI-based delivery of images the cache will of course become pass-through since there will be no file to cache just a stream of bytes....
Is the assumption there that by using CGI you'll automatically tweak a configuration in a caching proxy? If so then it's a flawed assumption.
But there is no file to cache? I don't have enough gear to set up a test with squid myself (and that would only be one cache) but how is the engine to know to cache it? My understanding is that CGI-generated content is usually not cached.
Having had a very quick look at your site, it seems a little strange that you want to defeat caching of those objects that soak up bandwidth; the request to perform "click-through" on the advert suggests that you're using the revenue to pay for your bandwidth costs. (So, one assumes that the more the material was cached, the less you'd have to pay, and the less you'd have to worry about page impressions.) I particularly like the way that you require my browser to send a Referer field to be allowed to view the pictures ;-)
I do indeed use the revenue to pay for bandwidth but the pictures, by and large (it's a work in progress) have been tuned for file size; still takes time to decompress but hey, what can I do. Also the projected load vs. the bandwidth is such that I have a LOT more room left. The users get a reasonably large bitmap in a reasonably small file. ImageMagick is a nifty set of programs. The problem I have is pirates who collect images and use them for other purposes. The pictures...well, I actually don't want them hanging around on the user's disk once the browser is no longer on the page. I haven't figured out how to make that happen other than expiration of 1 minute or something. I'm working on a system for managing/publishing photo web sites like mine. It will be released as free software when I get it all working satisfactorily and maybe I'll write some documentation.
Nice photos though...
Thanks. You do point out that while I pay fixed cost for bandwidth (my server is behind a DSL circuit) others might use the technology to host where they pay for usage as it occurs. A quandary.
I do indeed use the revenue to pay for bandwidth but the pictures, by and large (it's a work in progress) have been tuned for file size; still takes time to decompress but hey, what can I do. Also the projected load vs. the bandwidth is such that I have a LOT more room left. The users get a reasonably large bitmap in a reasonably small file. ImageMagick is a nifty set of programs. The problem I have is pirates who collect images and use them for other purposes. The pictures...well, I actually don't want them hanging around on the user's disk once the browser is no longer on the page. I haven't figured out how to make that happen other than expiration of 1 minute or something.
Dana, isn't there a HUGE difference between piracy and transient storage? Intent is one.
You do point out that while I pay fixed cost for bandwidth (my server is behind a DSL circuit) others might use the technology to host where they pay for usage as it occurs. A quandary.
A quandary to which whoever pays can respond accordingly. Anything from optimizing delivery of your site to cutting it off completely. Sure, that's an extreme, but don't you agree? -- Christian Kuhtz Architecture, BellSouth.net <ck@arch.bellsouth.net> -wk, <ck@gnu.org> -hm Atlanta, GA "Speaking for myself only."
On Fri, Oct 20, 2000, Dana Hudes wrote:
When I switch to CGI-based delivery of images the cache will of course become pass-through since there will be no file to cache just a stream of bytes....
Is the assumption there that by using CGI you'll automatically tweak a configuration in a caching proxy? If so then it's a flawed assumption.
But there is no file to cache? I don't have enough gear to set up a test with squid myself (and that would only be one cache) but how is the engine to know to cache it? My understanding is that CGI-generated content is usually not cached.
BZZT. Another assumption which is actually totally not true. For example, imagine your photo book. The photos won't change, right? The position in your database won't change, right? So .. http://www.domain.com/photos?id=31765 ok. 31765 is a static image that won't change. So, you'd be better off setting its expiry time to something high, wouldn't you?
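To make the expiry point concrete, here is a minimal sketch of the response headers a script could emit for a generated-but-unchanging object such as photo 31765. The function name `cacheable_headers` and the one-day lifetime are assumptions for illustration only; nothing in the thread says how the site is actually built, and individual caches may still apply local policy.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cacheable_headers(content_type, max_age_seconds, now=None):
    """Build headers telling caches they may keep a generated object.

    `now` is injectable so the result can be checked deterministically.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        "Content-Type": content_type,
        # Relative lifetime, understood by HTTP/1.1 caches.
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # Absolute expiry date, understood by HTTP/1.0 caches too.
        "Expires": format_datetime(expires, usegmt=True),
    }


if __name__ == "__main__":
    # A CGI-style script would print these before the image bytes.
    for name, value in cacheable_headers("image/jpeg", 86400).items():
        print(f"{name}: {value}")
```

With headers like these a cache has no need to guess from the URL whether the object is "dynamic"; the origin has said how long the bytes stay valid.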
Having had a very quick look at your site, it seems a little strange that you want to defeat caching of those objects that soak up bandwidth; the request to perform "click-through" on the advert suggests that you're using the revenue to pay for your bandwidth costs. (So, one assumes that the more the material was cached, the less you'd have to pay, and the less you'd have to worry about page impressions.) I particularly like the way that you require my browser to send a Referer field to be allowed to view the pictures ;-)
I do indeed use the revenue to pay for bandwidth but the pictures, by and large (it's a work in progress) have been tuned for file size; still takes time to decompress but hey, what can I do. Also the projected load vs. the bandwidth is such that I have a LOT more room left. The users get a reasonably large bitmap in a reasonably small file. ImageMagick is a nifty set of programs. The problem I have is pirates who collect images and use them for other purposes. The pictures...well, I actually don't want them hanging around on the user's disk once the browser is no longer on the page. I haven't figured out how to make that happen other than expiration of 1 minute or something.
You can't. End of story. This is the internet, people control their end-nodes, so you have zero chance of this happening. If you *REALLY* want to be evil, you wrap the images in a Java applet so they can't just right-click on it, but again that won't stop the smart people. Adrian -- Adrian Chadd The Law of Software Development and <adrian@creative.net.au> Envelopment at MIT: "Every program in development at MIT expands until it can read mail."
On Sat, 21 Oct 2000, Adrian Chadd wrote:
But there is no file to cache? I don't have enough gear to set up a test with squid myself (and that would only be one cache) but how is the engine to know to cache it? My understanding is that CGI-generated content is usually not cached.
BZZT. Another assumption which is actually totally not true. For example, imagine your photo book. The photos won't change, right? The position in your database won't change, right? So ..
http://www.domain.com/photos?id=31765
ok. 31765 is a static image that won't change. So, you'd be better off setting its expiry time to something high, wouldn't you?
I think the problem with caches and proxies is that the occasions where they offer you an object out of the cache when you wish they didn't are much more noticable then when everything's the way it should. Authors of CGI programs have coughed up many a skull because some cache somewhere was making them think there was a bug somewhere in their scriptery while it was in fact their browser/vendorproxy smoking crack. I seem to recall certain versions of IE, for example, who seem to automatically assume that the contents served by http://www.domain.com/rolldice.cgi?bet=30 will not change the second time you request the page, at least some of the times. I can imagine that this irritates these webdevelopers so much that they have grown a solid loathing for anything that tries to interfere with the traffic between their scriptics and the browser. Personally, I think that if you have a CGI that offers the same result on the same query every single time and has a limited dataset (like in the example of a photo album) you may be much better off to move towards a solution on top of the regular filesystem. So instead of going to http://www.isp.com/photogallery.cgi?person=pi&index=1 You go to http://www.isp.com/photogallery/pi/1 This saves you precious fork()s and allows for all the spiffy performance tweaks in both the OS and the webserver to optimize throughput without hogging your CPU with Wallware on every request. The toughest thing to learn about programming is when not to do it, I guess:).
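The query-to-path idea above can be sketched roughly as follows. This is only an illustration: the helper name `to_path_url` is hypothetical, and it assumes the `person` and `index` parameters from the example URL are the only inputs and are plain alphanumerics.

```python
from urllib.parse import parse_qs, urlsplit


def to_path_url(url):
    """Rewrite .../photogallery.cgi?person=X&index=N as .../photogallery/X/N."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    person = query["person"][0]
    index = query["index"][0]
    # Refuse anything that could escape the gallery directory.
    if not (person.isalnum() and index.isalnum()):
        raise ValueError("unexpected characters in query parameters")
    base = parts.path
    if base.endswith(".cgi"):
        base = base[: -len(".cgi")]
    return f"{parts.scheme}://{parts.netloc}{base}/{person}/{index}"


if __name__ == "__main__":
    print(to_path_url("http://www.isp.com/photogallery.cgi?person=pi&index=1"))
```

Once the objects live at plain paths, the webserver can serve them as ordinary files with Last-Modified dates, which is the kind of object caches tend to handle well.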
You can't. End of story. This is the internet, people control their end-nodes, so you have zero chance of this happening. If you *REALLY* want to be evil, you wrap the images in a Java applet so they can't just right-click on it, but again that won't stop the smart people.
I propose we shoot them. Pi -- A mouse is a device used to focus xterms.
Several folks have made sweeping statements that website owners/designers must make changes to live with caches, or their sites will suffer. One problem with that statement is that there are LOTS of caches and each has its own idiosyncrasies. Further, they can be locally tuned. Bill Simpson already stated publicly that he tunes his caches to ignore anything that says it isn't to be cached, for example. The whole mess reminds me a great deal of route filtering. Someone, somewhere might filter your route from their BGP feed. This might cause a large number of people to have no access to content provided by a server farm somewhere. How's the server owner (or farm owner) to know? They don't, until some prospective user sends a note that a site is "always down." Technologies which mess with the end-to-end nature of the 'net, and especially which are transparent to end users and far from content providers, invariably make the 'net less useful and less reliable. -- ----------------------------------------------------------------- Daniel Senie dts@senie.com Amaranth Networks Inc. http://www.amaranth.com
[ On Friday, October 20, 2000 at 09:44:37 (-0400), Dana Hudes wrote: ]
Subject: Re: decreased caching efficiency?
I pay my ISP to carry IP packets around. Caching is acceptable in some cases but not in others.
You control how your information is distributed. If you do not wish it to be cached in any meaningful manner by an ISP or anyone else then you must use protocols that are inherently uncachable, such as those protected by strong cryptography.
In some cases certainly your cache is in fact a copyright violation.
No, it's absolutely NOT. You knowingly publish your content in a medium where copying is an inherent part of the system (at many levels!). If you do not accept that arbitrary copies will be made of your data then you must not publish your information on the Internet. Period. No amount of ignorance of the medium is a valid excuse in this day and age. The only way you can prevent unabashed private copying is to enter into explicit contracts with those you securely distribute your own copies to. IANAL, but I have studied Copyright Law and I'm reasonably certain that it can only prevent people from making explicit commercial use of your works -- it cannot even cause an ISP to share any percentage of the monetary gain in bandwidth savings with you since that's not how copyright law defines commercial use. Network operators will do whatever's necessary to optimise their networks. Transparent caching, where it makes sense and is possible, will happen regardless of what content providers wish. If I can sell even 10% more bandwidth than I buy due to caching then you can bet your boots I'm going to use it! -- Greg A. Woods +1 416 218-0098 VE3TCP <gwoods@acm.org> <robohack!woods> Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>
I don't think anyone has mentioned this yet, but I think that AOL got a section added to the DMCA that immunizes ISPs/NSPs from the copying that occurs in the operation of the network. AOL, of course, does lots of caching. A content person would have a *very* steep hill to climb in trying to argue that an ISP's decision in how to operate their network is somehow beyond the scope of this immunization. ----------------- Brian Curnow ----------------
"Greg A. Woods" wrote:
[ On Friday, October 20, 2000 at 09:44:37 (-0400), Dana Hudes wrote: ]
Subject: Re: decreased caching efficiency?
I pay my ISP to carry IP packets around. Caching is acceptable in some cases but not in others.
You control how your information is distributed. If you do not wish it to be cached in any meaningful manner by an ISP or anyone else then you must use protocols that are inherently uncachable, such as those protected by strong cryptography.
In some cases certainly your cache is in fact a copyright violation.
No, it's absolutely NOT. You knowingly publish your content in a medium where copying is an inherent part of the system (at many levels!).
If you do not accept that arbitrary copies will be made of your data then you must not publish your information on the Internet. Period. No amount of ignorance of the medium is a valid excuse in this day and age.
<snip> For what it's worth, the Digital Millennium Copyright Act (DMCA) explicitly EXEMPTS caching from being treated as a violation of copyright. (This only holds for music and maybe video, but it is a clear precedent.)
Greg A. Woods
+1 416 218-0098 VE3TCP <gwoods@acm.org> <robohack!woods> Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>
Regards Marshall Eubanks Multicast Technologies, Inc. 10301 Democracy Lane, Suite 201 Fairfax, Virginia 22030 Phone : 703-293-9624 Fax : 703-293-9609 e-mail : tme@on-the-i.com http://www.on-the-i.com
On Thu, Oct 19, 2000, John Fraizer wrote:
On this end of the pond, many of us measure success with MRTG and can very accurately plot profits using the same OIDs. You are absolutely right. When you're in the business of selling bandwidth, that is what you are interested in. Not being on the other end of a sat link helps I suppose.
John, I agree, but I would have thought that web providers would have loved the idea of making more money off less equipment. :-) This is why I believe CDN type technology is proving to be more popular than the original forward-caches. I fear though that more and more content will end up defeating even local caches on small links, and THIS is a big problem in my eyes. Solve that, and the problem in the original email is solved .. Adrian -- Adrian Chadd "It was then that I knew that I wouldn't <adrian@creative.net.au> die, as a doctor wouldn't fart in front of a dying boy." -- Angela's Ashes
On Fri, 20 Oct 2000, Adrian Chadd wrote:
On Thu, Oct 19, 2000, John Fraizer wrote:
On this end of the pond, many of us measure success with MRTG and can very accurately plot profits using the same OIDs. You are absolutely right. When you're in the business of selling bandwidth, that is what you are interested in. Not being on the other end of a sat link helps I suppose.
John,
I agree, but I would have thought that web providers would have loved the idea of making more money off less equipment. :-)
How do I make more money when someone upstream from me caches information from my clients' webservers and subsequent requests never make it to my clients' servers and therefore my clients' servers never reply? Bandwidth is NOT a problem for us. It sucks that it is in some parts of the world but it ISN'T here. We'll lay more fiber when we need more pipe. It's that simple. --- John Fraizer EnterZone, Inc.
[ On Saturday, October 21, 2000 at 01:47:15 (-0400), John Fraizer wrote: ]
Subject: Re: decreased caching efficiency?
How do I make more money when someone upstream from me caches information from my clients' webservers and subsequent requests never make it to my clients' servers and therefore my clients' servers never reply?
Well now, since that's the core secret to making money on the WWW you don't expect us to all blurt it out at once now, do you? :-) Note that caching generates savings, not profits! However as a web server operator who doesn't have to pay anything but understanding and co-operation for those savings, the potential is there to turn them all directly into profits. It's just a matter of how you look at things and how you understand what you're selling. There's already been ample discussion here about how one can use the WWW properly while still allowing commercial accounting to take place. Even if you don't explicitly co-operate with caches you still need to be aware of them. If a WWW business model doesn't take caching into account then it damn well better not expect to reach anything even approaching the entire Internet. North America is probably the only place on the Internet where caching isn't more or less a necessity, and it's no doubt soon going to lose its status as one of the largest parts of the Internet, if it hasn't already (which may in fact turn it into one of the places where hierarchical caching is a necessity too!). -- Greg A. Woods +1 416 218-0098 VE3TCP <gwoods@acm.org> <robohack!woods> Planix, Inc. <woods@planix.com>; Secrets of the Weird <woods@weird.com>
Hi Christian, At 10:29 AM 19/10/2000 -0400, Christian Kuhtz wrote:
Has anyone else around here noticed a decrease in caching efficiency over say, the past year or so? Seems we've seen a radical drop (order of magnitude). Seems popular sites are using more and more entirely dynamic, rapidly changing content..
we haven't observed any real drop in overall hit-rates. providing you have sufficient disk storage provisioned with your cache, you should still be seeing 35%+ byte-hit-rates providing you have a sufficient target customer base to populate the cache. if you do have 'significant' customer-base, then a byte hit-rate around 50% isn't uncommon.
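The figures above are byte hit rates, which can differ a lot from the request hit rate most people quote. A small sketch of the difference, assuming each log entry has already been reduced to a (was_hit, size_in_bytes) pair; the function name and the sample numbers are invented for illustration and are not measurements from any real cache.

```python
def hit_rates(entries):
    """Return (request_hit_rate, byte_hit_rate) for (was_hit, size_bytes) pairs.

    Hypothetical helper; how a real access log gets reduced to these
    pairs depends on the cache product and is not shown here.
    """
    requests = hits = total_bytes = hit_bytes = 0
    for was_hit, size in entries:
        requests += 1
        total_bytes += size
        if was_hit:
            hits += 1
            hit_bytes += size
    if requests == 0 or total_bytes == 0:
        return 0.0, 0.0
    return hits / requests, hit_bytes / total_bytes


if __name__ == "__main__":
    # Made-up sample: small objects hit, large objects miss.
    sample = [(True, 1000), (True, 2000), (False, 7000),
              (True, 500), (False, 9500)]
    request_rate, byte_rate = hit_rates(sample)
    print("request hit rate: %.3f" % request_rate)        # 0.600
    print("byte hit rate:    %.3f" % byte_rate)           # 0.175
    print("fetched upstream: %.3f" % (1.0 - byte_rate))   # 0.825
```

Only the byte hit rate says anything about upstream bandwidth saved: a cache that answers 60% of requests locally may still pull more than 80% of the bytes from the origin if the hits are mostly small objects.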
If there's indeed a reduction in efficiency, caches simply introduce more transactional latency and provide no benefit to offset cost. What do people consider reasons to be to keep caching in the network? Have caching infrastructures materialized as starting points for content distribution, or have you guys ultimately rebuilt your infrastructure to serve that specific purpose?
caches exist for multiple reasons --
[1] to make things faster
[2] to save bandwidth
[3] to achieve more "goodput" in network transactions.
[4] to operate at layers-8 and 9 (filtering)

in terms of latency, you might want to look at what your caching product does on accepting a connection. many vendors' products initiate a DNS lookup of their own, in addition to the one the user's web browser has already done. if that is the case, ensuring that the user's DNS lookups go to the same caching nameservers as your caches could be a worthwhile thing. of course, i might argue that a transparent cache shouldn't need to hold back a http request when it already knows the dst-ip-address that the flow was destined to go to, but then again, i might be considered biased.

in many cases, people significantly underestimate the effect of #3 - and it isn't easily measured. it is the effect of a "good" tcp stack cutting down end-to-end tcp retransmissions when the "last mile" hop is congested.

no comments on #4 .. probably doesn't apply to the US anyhow.

cheers, lincoln.
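The DNS point in the message above can be sketched as a choice the cache makes on a miss. Everything here is hypothetical (function name, parameters, the injected resolver); it is not how any particular vendor's product works, only the decision being argued for.

```python
def origin_address(host, intercepted_dst_ip, resolve):
    """Pick the address to fetch from on a cache miss.

    Returns (ip, did_dns_lookup). 'resolve' is any callable mapping a
    hostname to an IP string; it is injected so the sketch runs without
    network access. All names here are illustrative assumptions.
    """
    if intercepted_dst_ip is not None:
        # Transparent interception: the client already resolved the name,
        # so the request need not wait on a second lookup.
        return intercepted_dst_ip, False
    # Explicitly configured proxy: only the hostname is known.
    return resolve(host), True


if __name__ == "__main__":
    def fake_resolver(name):
        # Stand-in resolver returning a documentation-range address.
        return "192.0.2.10"

    print(origin_address("www.domain.com", "192.0.2.55", fake_resolver))
    print(origin_address("www.domain.com", None, fake_resolver))
```

The trade-off is that reusing the client's destination address means trusting the client's DNS answer, which is one reason a product might choose to do its own lookup anyway.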
participants (16): Adrian Chadd, Andrew Bangs, bcurnow, Christian Kuhtz, Dana Hudes, Daniel Senie, Ian Cooper, Jason Slagle, John Fraizer, Lincoln Dale, Majdi S. Abbas, Marshall Eubanks, Patrick Greenwell, Pim van Riezen, William Allen Simpson, woods@weird.com