It Doesn't Pay to be Popular
by Glenn Fleishman
05/30/2003
It rarely pays to be popular, unless you're a film star or an amusement park. On the Internet, popularity and cost often go hand in hand. The more traffic you get (unless it's tightly bound to sales), the more money it costs you. This is because most bandwidth usage is centralized: one server and one network feed all of the hungry mouths, and you pay the piper for spikes and sustained use.
In mid-March, I narrowly averted a $15,000 bandwidth bill that arose when I offered Real World Adobe GoLive 6 as a free PDF. The site at which I hosted the download is a Level 3 Communications co-location customer, and Level 3 charges on sustained bandwidth.
The book had 10,000 downloads, representing nearly 250 gigabytes, in just 36 hours. I averted any cost because Level 3 drops the busiest five percent of hours each month, which is just over 36 hours in an average month. My 36th hour was a few megabits per second, it turned out; my 37th or so, below 500Kbps.
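The arithmetic behind that narrow escape can be sketched in a few lines. This is an illustration only, not Level 3's actual billing code: the function names, the flat 36-hour spike, and the 0.4Mbps quiet baseline are assumptions made for the example, and real traffic tapers off rather than stopping cleanly.

```python
def average_mbps(gigabytes, hours):
    """Average rate in Mbps for a transfer, using decimal units (1 GB = 8,000 megabits)."""
    return gigabytes * 8000 / (hours * 3600)


def billable_rate(hourly_mbps, drop_percent=5):
    """Return the busiest hourly rate that remains after dropping the top hours."""
    if not hourly_mbps:
        raise ValueError("need at least one hourly sample")
    ranked = sorted(hourly_mbps, reverse=True)
    dropped = len(ranked) * drop_percent // 100  # 36 of 720 hours at five percent
    return ranked[dropped]


# Hypothetical 720-hour month: a flat 36-hour spike, then a quiet baseline.
spike_rate = average_mbps(250, 36)
month = [spike_rate] * 36 + [0.4] * (720 - 36)

print(round(spike_rate, 1))   # 15.4
print(billable_rate(month))   # 0.4
```

With these made-up samples, a 36-hour spike is discarded entirely, while a 37th busy hour would set the billable rate at the full spike level. That is the cliff edge described above.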
If I were living in the future, the scenario might have been less harrowing. Systems that involve peer-to-peer file sharing -- which distribute parts or entire files across a whole system to reduce strain on any one part -- and edge server-to-edge server networks, like Akamai's system of pushing files topologically closer to downloaders, would make the effective cost of bandwidth lower by never straining single locations.
In coping with the stress of a high-ticket bill and then its aftermath, I learned a lot about managing expectations, dealing with high-bandwidth needs, and the true current cost of bandwidth. I also had a chance to reflect on how current distributed file sharing requires either too much work or involves too many unrelated political and legal issues.
(For more on the social costs and the follow-up on what happened, you can read the New York Times article I wrote that appeared in April 2003. The Times requires registration and is now charging for access to the archives after about seven days.)
Hosting and Co-Location: Where Things Stand Now
When I started buying in-house bandwidth at the T-1 level (1.544Mbps dedicated digital service) back in 1994, it cost about $2,000 per month. And that didn't include unlimited bandwidth: after a small number of gigabytes, I was charged $50 for each additional gigabyte. I, in turn, passed this on to my Web hosting clients.
In the near decade since, bandwidth costs have dropped but not plummeted; you still have to shop around. If you want in-house bandwidth with more than just a few static IP addresses, you'll wind up spending several hundred dollars a month on the low end for 512Kbps DSL to T-1 service in places, like Seattle, with lots of competition. In other parts of the country, with older infrastructure or less competition, a T-1 could still top $2,000 per month.
On top of the actual local loop, or the line that links your network to the ISP or network provider, most companies charge an excess bandwidth fee. I switched my office network from one Seattle-based firm to another, Speakeasy Networks, because the first firm only allowed a few tens of gigabytes of traffic a month before charging $30 per gigabyte. (They currently charge only $10 per gigabyte.)
Speakeasy Networks has pursued a more reactive model: they let you eat all the bandwidth you want, and only monitor for abuses, shutting down illegitimate uses of the network, such as warez, porno, or scams.
Most of us want high-availability throughput for our web sites in the single- or double-digit Mbps range without paying for it individually in our homes or offices. We turn to hosting or co-location. Monthly fees cover the basics, including storage on a server or the rack space, electricity, backups (battery and data), and air conditioning. Most hosting and co-location companies set maximum monthly bandwidth usage as part of a level of service and charge by the meg or gig thereafter.
The rates can still vary from reasonable to ridiculous. In researching co-location and web hosting companies to find out the going rates (and to find a location for my book-price comparison site, isbn.nu, which had outgrown my office's 768Kbps SDSL line), I found you could pay anything from $1 to $100 per gigabyte without any good reason for the disparity.
I chose to move isbn.nu to a local co-location and hosting company, digital.forest, which has a nearly 10-year history, an eternity in Internet time. Given my bandwidth near-blowout, their rate of $1 per gigabyte after the first 40GB seemed delightful to me. (I also noted that they had redundant fiber optic lines: one running north and one south from their suburban location northeast of Seattle.)
digital.forest offered me a 10Mbps connection, and showed me their current capacity and utilization, all of which made me confident in their ability to deliver.
If I'd chosen to go with Level 3, I could have gotten 100Mbps feed as part of the basic deal, but there's terror that goes with that. Level 3's pricing in Seattle starts under $1,000 for a cage with a sub-1Mbps sustained bandwidth utilization, but costs $1,000 per Mbps above that; in the high Mbps, you start to pay less, and you can contract for higher bottom levels, too.
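To see how a sustained rate turns into a five-figure bill under this kind of pricing, here is a rough sketch. The $1,000 base fee and the 1Mbps included allowance are simplifying assumptions read off the figures above, the function name is hypothetical, and the volume discounts at high Mbps are ignored.

```python
def monthly_bill(billable_mbps, base_fee=1000.0, included_mbps=1.0, per_mbps=1000.0):
    """Estimate a monthly charge from the billable sustained rate in Mbps.

    Assumed model: a flat base fee covers the first included_mbps, and every
    Mbps above that costs per_mbps. High-volume discounts are left out.
    """
    extra = max(0.0, billable_mbps - included_mbps)
    return base_fee + extra * per_mbps


print(monthly_bill(0.5))   # 1000.0
print(monthly_bill(15))    # 15000.0
```

A sustained 15Mbps, roughly what 250 gigabytes in 36 hours averages out to, lands at about $15,000 under these assumptions.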
For many people, shared space on a server with access to ASP, PHP, JSP, MySQL, and other servers and languages is really all that's needed, and hosting instead of a dedicated server can more than suffice. Oddly, though, bandwidth costs tend to be much, much higher for hosting than co-location, when you'd think the reverse would make more sense.
EarthLink, for instance, allows a reasonable amount of bandwidth per month for its dial-up and DSL customers' sites, but cuts you off if you exceed a limit that they don't precisely define. Here's how they explain it:
Each member's free webspace is allocated a certain amount of traffic per month (traffic is calculated on a formula multiplying the number of hits that your site receives by the size of your files). If a site exceeds its maximum monthly allotment of traffic, the site will become unavailable until the beginning of the next calendar month. A site that exceeds the EarthLink Member's maximum allotment in size will also become unavailable. Unavailability includes but may not be limited to the inability to access the site publicly or to publish to or modify the site's contents via certain Web creation tools. More information about appropriate use of the free member webspace appears under Free Webspace Community Guidelines.
Follow the link and you find more details: Each member's free webspace is allocated at least 1GB of traffic per month (traffic is calculated on a formula multiplying the number of hits that your site receives by the size of your files). If a site exceeds its maximum monthly allotment of traffic, the site will become unavailable until the beginning of the next calendar month. A site that exceeds the maximum allowed webspace size will also become unavailable. Unavailability includes but may not be limited to the inability to access the site publicly or to publish to or modify the site's contents via certain web creation tools.
Customers buying a business hosting package, however, need to pay much more heed: EarthLink's three basic hosting packages include many features for $20 to $85 per month, with 10 to 30GB of bandwidth use included each month. Cross that limit and you start paying 10 cents per MB. That's right: $100 per GB!
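The gap between those overage rates is easiest to appreciate with a quick calculation. The sketch below is illustrative only: it applies the 250GB from my download spike to two of the rate structures mentioned in this article, treats 1GB as 1,000MB, and uses plan labels and a function name invented for the example.

```python
def overage_cost(used_gb, included_gb, dollars_per_gb):
    """Cost of traffic beyond the included monthly allowance."""
    return max(0, used_gb - included_gb) * dollars_per_gb


# (included GB, dollars per extra GB); labels are shorthand, not official plan names.
plans = {
    "co-location at $1/GB after 40GB": (40, 1),
    "business hosting at $100/GB after 30GB": (30, 100),
}

for label, (included, rate) in plans.items():
    print(label, overage_cost(250, included, rate))
# co-location at $1/GB after 40GB 210
# business hosting at $100/GB after 30GB 22000
```

The same 250GB month costs a couple of hundred dollars in one case and more than twenty thousand in the other.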
Apple's .Mac service offers web site hosting, but requires a Mac to manage the account. Once it's set up, files can be uploaded via WebDAV from any platform. Apple declined to provide specific information about how they monitor and limit bandwidth, but they said they encourage all legitimate uses, such as sharing QuickTime movies created in iMovie. They don't charge for bandwidth at any level.
Whether using a hosting service or co-locating a server, you need to ask several critical questions before popularity strikes:
- Can you monitor bandwidth usage via MRTG or another reporting tool?
- Does the ISP cut you off, notify you, or just start charging when you exceed limits?
- How humane -- like EarthLink -- are they about natural bandwidth disasters? EarthLink typically waives charges for legitimate, "accidental" bandwidth overages, such as a too-popular web page.
For instance, Level 3 just lets the bandwidth roll, and offers MRTG monitoring -- using a secure card's one-time number generator to restrict access.
Distributed and P2P Bandwidth
Within a few days after I shut off the bandwidth to feed out the PDF of my book, sympathetic colleagues and strangers offered suggestions for continuing to make it available: distribute it through mirrors, and distribute it through peer-to-peer file sharing networks.
I was able to immediately engage on the former through some generosity. My colleague, friend, and wireless networking book co-author Adam Engst is also a moderator of the Info-Mac Archives, a collection of legal downloadable files, the inception of which goes back to the early 90s at Stanford.
Currently, the archives are hosted at MIT, but files are served through a few dozen mirrors worldwide (mostly at academic institutions, but including AOL and Apple). These collective mirrors probably have the capability to feed a gigabit per second.
Adam suggested uploading the file to the archives. As part of my contribution to their effort, I wrote round-robin Perl scripts that would allow easy random redirection to a given file or directory. The 10,000 downloads of my file, had they occurred through Info-Mac, would have been a tiny distributed blip.
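The idea behind such a redirector is simple enough to sketch. What follows is not the Perl I wrote for Info-Mac; it is a minimal Python illustration of random redirection, and the mirror URLs, file path, and function names are all hypothetical.

```python
import random
from urllib.parse import quote

# Hypothetical mirror base URLs; a real list would come from the archive maintainers.
MIRRORS = [
    "http://mirror-a.example.edu/info-mac",
    "http://mirror-b.example.org/pub/info-mac",
    "http://mirror-c.example.net/archives/info-mac",
]


def redirect_target(path, mirrors=MIRRORS, rng=random):
    """Pick a mirror at random and return the full URL for the requested path."""
    base = rng.choice(mirrors)
    return base.rstrip("/") + "/" + quote(path.lstrip("/"))


def redirect_response(path, mirrors=MIRRORS, rng=random):
    """Build the status line and headers for an HTTP redirect to a random mirror."""
    return "302 Found", [("Location", redirect_target(path, mirrors, rng))]


status, headers = redirect_response("/book/golive6.pdf")
print(status, headers[0][1])
```

Every request for the file goes to a single small script, which hands the actual transfer off to whichever mirror it picks, so the origin server only pays for the redirect.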
It's not easy for the average individual to have access to this kind of distribution system, but there are plenty of archives like sourceforge.net for information, scripts, and programs (free, demo, or shareware), and there's no reason to host a file that could be placed into a distributed or replicated archive.
A few dozen people suggested peer-to-peer file sharing. Beyond sharing pirated music, P2P networks are an efficient way to use the vast pool of bandwidth available to individuals and reduce the load on any given machine.
Although folks said try Kazaa, LimeWire, and other well-known services, BitTorrent was the name that came up again and again, partly because it works on several platforms, and doesn't require a big installation.
The best-known file sharing systems generally distribute entire files to other machines, or allow people to expose their directories of files, and let the user choose the best source. BitTorrent distributes pieces of files so that given a large pool of downloadable bandwidth, a file might be reassembled from many pieces in many places.
In peer-to-peer systems, however, you can't necessarily be sure that a given file is the same one an author meant to upload, that the file has been vetted for viruses, or that each version of the file throughout a network is the same as every other file. BitTorrent uses cryptographic hashing to verify that the file you received was correctly and legitimately reassembled, but it doesn't verify as a system that it's the file that an author or creator intended it to be.
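A small sketch shows what that kind of hashing does and does not establish. BitTorrent's metainfo file carries a SHA-1 hash for each fixed-size piece of the content; the code below imitates that idea in simplified form. The piece size, function names, and sample data are assumptions for illustration, and this is not the BitTorrent client's own code.

```python
import hashlib

PIECE_SIZE = 262144  # 256KB; an assumed piece size for the example


def piece_hashes(data, piece_size=PIECE_SIZE):
    """Split data into fixed-size pieces and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(data[i:i + piece_size]).digest()
        for i in range(0, len(data), piece_size)
    ]


def matches(data, expected_hashes, piece_size=PIECE_SIZE):
    """True if every piece of data hashes to the value published for it."""
    return piece_hashes(data, piece_size) == expected_hashes


original = b"Real World Adobe GoLive 6 " * 30000   # stand-in for the PDF
published = piece_hashes(original)                 # what the author would publish

tampered = original[:-1] + b"!"
print(matches(original, published))   # True
print(matches(tampered, published))   # False
```

The check proves that the reassembled bytes match the published hashes. It says nothing about who published those hashes, which is the gap described above.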
Unfortunately, the P2P moniker has gotten a generally bad name, and I would be concerned at suggesting to readers that they run a program which, on certain ISPs, at certain universities, and in certain corporations, could get their accounts cancelled, their butts expelled, or themselves downsized.
There's also a problem in finding the "target demographic:" folks who are likely to want to download a copy of a book on Adobe GoLive 6 probably have a tiny overlap with people who understand peer-to-peer file sharing software, or have the desire to load it.
More appropriate to this kind of download might be Akamai's model of content distribution, which involves placing servers all over the Internet, even in individual ISPs, to replicate content to specific locations. The bandwidth use is thus minimized inside certain small topological network areas, reducing ISPs' costs and the costs of delivering content.
Akamai isn't ad hoc, however, and you have to have a commercial relationship with them. An edge-server-based file distribution system could solve many of the problems with peer-to-peer sharing, legitimize distributed file sharing, and improve speed and availability. But it would require some centralized authority that would verify the legality of uploaded files for distribution and then sign them for verified distribution.
With Apple making electronic music purchases simple, perhaps P2P or edge-to-edge sharing could become workable with the kind of assurances and consistency needed, without circumventing acceptable use policies or copyright.
Conclusions
I wish I'd been smarter in investigating costs before I got started, but my switch to digital.forest certainly allows me some peace of mind: they monitor and notify, and charge so little for bandwidth that even a blowout won't send me careening off the road.
What I really look forward to is a day when we truly have a pool of international bandwidth and distributing information, especially free information, becomes as simple as checking a box and letting the underlying mechanisms sort out the bandwidth. In that world, no one person gets stuck with the bill, just as no one actually pays for the whole Internet, either.
Glenn Fleishman is a freelance technology journalist contributing regularly to The New York Times, The Seattle Times, Macworld magazine, and InfoWorld. He maintains a wireless weblog at wifinetnews.com.
-
Bittorrent guarantee identical to direct download
2003-08-13 06:15:29 anonymous2 [Reply | View]
You need to just also include an MD5 file so people can check if their file is the real one. Search for it on Google; not too sure what's the easiest way to do it, but you see it all the time with big files like Linux distros etc.
-
Managed Dedicated Server, 700GB/month for $300
2003-06-14 10:28:52 anonymous2 [Reply | View]
For savvy Linux system administrators, 700GB of traffic may be purchased for as little as $100. See http://www.rackshack.net/ . Companies offering staff-access-only, "in-house" dedicated servers enjoy several cost advantages over the old business "Exodus" model of expensive, high security buildings with leased cages.
For those who are not able to put in a few hours a month in maintaining and administrating a Red Hat Linux server, other options exist, including arranging for a system administrator to maintain a server on your behalf. I typically perform these services for $200 to $300 per month, depending on the specifics.
I think you might also find that Sprint's business DSL service offers lower bandwidth costs than Speakeasy. Last time I checked, they offered 6 Mbit (5 Mbit down, 1 Mbit up) for $160 per month, with no bandwidth restrictions.
-
BitTorrent gets verification right
2003-05-31 21:00:13 anonymous2 [Reply | View]
BitTorrent verifies content based on the .torrent file, but does not specify how the .torrent file itself is distributed or verified. This allows any number of existing mechanisms to be used for distributing and verifying the small .torrent files.
The .torrent files can be sent in signed mail messages, distributed through https, verified against md5sums and so on.
-
simple solution for peer to peer downloads
2003-05-31 18:02:36 anonymous2 [Reply | View]
Simply post the md5 (preferably sha) sum next to the filename on your site or next to the torrent link.
If people care they can confirm that they got your book.
You can even do it in a clean cross platform way by using something like www.md5summer.org which is compatible with the output of GNU md5sum. Pretty cool even if you're just moving some important files around and want to make sure....
As far as file distribution on p2p, there are some halfway decent ways to do it now with something called a "magnet uri". This is essentially a url with an embedded base32-encoded (so it works in a url) sha sum of the file you're looking at.
http://magnet-uri.sourceforge.net/
If it's set up right in your browser on Windows it will launch gnucleus or something and start searching for the file based on the sum. At least in theory; it's a bit hackish since it's not integrated.
I hope that there is some interest in folding the concept into the gnutella protocol. It would be a tremendous advance for p2p in that it would allow Golive Books, Game Patches and Starwars Kids to be easily replicated throughout the net.
Just need to get people using the protocol. It would GREATLY benefit p2p to have a higher percentage of legitimate traffic on it... apt-gnutella or rpmfind-gnutella anyone?
Adam of the domain devtty.net
-
Bittorrent guarantee identical to direct download
2003-05-30 22:58:36 anonymous2 [Reply | View]
BitTorrent confirms that the resulting files match the description in the torrent file. If you distribute the torrent file from your site, then you have the same level of assurance about the resulting files as you do for a direct download from your site.
BitTorrent would have been an excellent choice for you.
-
Bittorrent guarantee identical to direct download
2003-05-31 09:04:16 eggboard [Reply | View]
That's not exactly what I was writing about: if you distribute the file directly and have it distributed through the system, yes; but people downloading have no assurance that any file named X is the same file as X. You see what I mean? If you don't know the author, and there's no trust mechanism that currently allows you to "know" an author, then you don't know whether a file originated from the author who is distributing it or not.
Someone could take our book PDF, add a virus, bundle it up, and name it "Real_World_Adobe_GoLive_6.pdf.exe".
BitTorrent isn't an excellent choice at the moment for the reasons I cited: people can be fired, suspended, or expelled for using ANY P2P in many places, and our likely reader base wouldn't use a program like BitTorrent for reasons of obscurity (they're not in that community of people who know what P2P is) or job security.
-
Yes, but the problem is not specific to BitTorrent
2003-05-31 15:27:47 anonymous2 [Reply | View]
Let's review how BitTorrent works.
1. You create .torrent file from original content. The .torrent file contains crypto strong hashes of the original content.
2. You distribute .torrent file through website, mail or some other mechanism.
3. Users download content as described by .torrent file.
4. BitTorrent checks hashes in .torrent file.
The weak link here is step #2. Users don't have a strong guarantee that the .torrent file is the one you generated.
Note that this weakness is identical to direct download. Users do not have a strong guarantee that downloaded file named X is identical to the original file X.
You can make a stronger guarantee in the direct download case by using https, but the same holds true for distributing the .torrent file.
Just to be clear, you create the .torrent file containing the hashes and you distribute the .torrent file containing the hashes. The weak link is in the distribution and that weak link is identical to direct download.
By singling out this issue with BitTorrent, you lead readers to believe that this is a weakness of BitTorrent compared to direct download. There's a lot of FUD about p2p. It's sad to see that you are adding to it.
-
Yes, but the problem is not specific to BitTorrent
2003-05-31 15:58:35 eggboard [Reply | View]
You're reading the context wrong, and it is, in fact, different with direct download.
In the article, I cite the general problem as: "In peer-to-peer systems, however, you can't necessarily be sure that a given file is the same an author meant to upload, that the file has been vetted for viruses, or that each version of the file throughout a network is the same as every other file." Then I mention BitTorrent's method of crypto as a specific example of trying to solve one part of the problem that doesn't actually verify or vet the file. So that's a general-to-specific example, not a condemnation of BitT above other P2P.
Second, many sites do employ a variety of methods including MD5 and public key signing to ensure that a direct download is as promised. MD5, of course, only ensures that a file matches what's said on a Web site or in an email or newsgroup posting. If you use the methods recommended to obtain the verification of public keys used to sign downloads out of band (that is, not via a Web site or through email directly), then when you download a file, you can verify that the person or organization that you think created the file did, in fact, sign the file and it's been untampered with. (The cases in which this is a problem involve a lack of out-of-band confirmation of the public key, and so were more like just checksumming not ensuring integrity.)
So you're definitely RIGHT in that the problems are P2P based, but they're exacerbated by a distributed mechanism in that the "author" doesn't define where the downloaded file is authoritative from.
Obviously, a way to make this work better would be to tie in Web sites or subsites on a Web site that managed the crypto: signed files, etc., and have a streamlined method of obtaining keys or keys signed by other keys, so that any file in BitTorrent had to have some identity confirmed at the end of a chain, not just crypto hashing confirmation of the individual file.
It's definitely a global problem, but it's "solved" in the sense that sites like apache.org or sendmail.org use mechanisms that allow verification. If those files are then distributed through BitTorrent those same methods of verification work.
-
Yes, but the problem is not specific to BitTorrent
2003-06-02 14:34:31 anonymous2 [Reply | View]
I think you're missing something. BitTorrent implements the solution you describe.
You as the author create the (small) .torrent file with the checksums in it. You host this on your webserver and link to it. The first step of a BitTorrent session is for the user to download these checksums directly from you.
Then BitTorrent does its peer to peer magic and retrieves the actual file (your pdf). The client checks the pdf against the .torrent file to ensure that what the user gets is exactly what you created.
If you still disagree, please read about the BitTorrent protocol. It's a very different beast than the Kazaas and Gnutellas of this world. For example, there is no search engine built in. A user doesn't search inside BitTorrent for your book to obtain it. She goes directly to your website and clicks the BitTorrent link that you have set up. Thus her client can guarantee that she gets exactly what you want to give her.
Of course, as you mention, there is still a stigma against peer-to-peer programs in general. This is probably because most of these programs are really designed to make it easy to illegally share copyrighted work.
BitTorrent is different. It's designed from the ground up to solve the very problem that you are having. As people get more comfortable using it, I think the stigma will begin to fade.
-
Yes, but the problem is not specific to BitTorrent
2004-02-01 02:41:56 susy_miller [Reply | View]
Actually, nothing is missed. It works, look closely. If you find out the best documentation ever written in the matter, you might conclude that everything works fine.
__________________________________________________
Translated by Mail-Translator