<david.weekly.org> July 25 2008
writings Gnutella and the State of P2P
 
{ auf deutsch
en español
en français
}
  <d.w.o>
  about
  books
  code
  codecs
  mp3 book
  news
  pictures
  poems
  university
  wine
  writings
  video
  get my updates



don't email
written for the pho list

To be honest, Gnutella is not doing very well.

There are about 2300 gnutella hosts on the network accepting incoming connections [statistic from LimeWire.com] that are part of the "big cloud" (i.e., public network - it's possible to set up independant trading "clouds" on gnutella, too.) One interesting consequence of this is that all of the public nodes are known and easily harvestable. Since most nodes hooked up to gnutella are on high-speed links whose IP addresses don't change very often, unless you're behind a trusty firewall you may find yourself open to attack.

Gnutella prides itself on being a true peered network, but there are less than a dozen centralized "hostcatchers." Lacking these, the network could not function. While it'd be easy to set up a new hostcatcher if others were taken down, the fact of the matter is that most gnutella programs come preprogrammed to connect to a specific hostcatcher. Just send off little letters to the ISPs hosting those servers and you could defeat the majority of gnutella users and make it a real pain for new users to join.

There are a wide variety of clients (a good thing), but most of them suck. It's not really fair to be that critical of the original program or even the gnutella architecture, since all it was was a quick hack / beta implementation on Justin's part (who has been gagged by AOL). I can't imagine how frustrated he must be at people (perhaps like me) railing at gnutella's flaws and its lack of ease of use when what's out there is a prerelease developer beta that he's unable to continue.

There's a pretty good review of the clients out there at LimeWire, but I have to say that BearShare is by far and large my favorite. It renders much more information and is much easier to use (IMHO) than any of the other clients.

But even with BearShare's pretty and helpful UI, gnutella is not very usable. The main reason is bandwidth. Just being connected to the network (not downloading or uploading any files) is pushing about 100kbps of traffic onto the network, with bursts up to 250kbps. Most cable modems are capped at about 128kbps of upstream traffic. Even high-quality ADSL can't usually push more than 100kbps or so, and that's already going to pretty much nuke your download capability (ADSL is odd like that!). So just being connected to gnutella requires you to be on "extreme broadband," such as a work connection or an SDSL line. This is again before you are actually uploading or downloading any files.

There are other problems, too. Only one out of every ~15 search results will actually be downloadable, and a good percentage of those that do succeed are pretty slow (possibly because they are already burning all of their bandwidth in simply participating on the network). On napster, around 2 out of 3 search results are actually downloadable, leading to an environment much more amenable to actual usage.

On a deeper level, Justin's gag on continued work on this project means that this space will probably be pathetically without a true leader. Sure, various figures pop up from time to time to try and shepherd the gnutella development/user communities in various directions (and are periodically mistaken as "the creators of Gnutella" - as happened in the case of Gene Kan), but this really is Justin's aborted baby. Best, IMHO, to let it die and create something anew to replace it.

I can assure you that that entity will not be gPulp (formerly known as gnutellaNG, for Gnutella: Next Generation). For those of you not in on the story, a number of people who had put together gnutella clones and who understood some of the limitations of the original (unfinished) gnutella architecture wanted to band together to create a next-generation protocol and architecture that would provide a great deal more robustness, speed, scalability, and flexibility. They started talking, setting up a mailing list and website (at gnutellang.wego.com) and passed around a lot of really great academic papers on scalable peered systems (some of which are still sitting on the message boards). But this group, since it was talking and not coding, rapidly got very political. One member, Sebastien Lambla, who had taken charge of the gnutellang website, spontaneously declared the project renamed to gPulp and announced himself as the "gPulp Consortium President." Pretty silly, considering there wasn't even an attempt at consensus among the other developers. Sebastien proceeded to turn gPulp into something different than a next-generation gnutella, announcing that it would be a "resource discovery protocol" (without actually specifying the mechanics of file transfer or other peered solutions).

Sebastien, as if he wasn't already making enough enemies, then decided to set up gpulp.com/org/net and establish those as the new center. As part of the move-over process he announced (and the announcement is still on the gnutellang page) that he was going to be deleting all of the message board content on the gnutellang website. Keep in mind here that the discussions and resource links mentioned in those posts are *the only valuable thing that the gnutellang group has done to date*. He redirects people to gpulp.tv as the new center for next-generation development. Yes, go there. No, you won't be able to connect (as of 2/3/01 @ 1:30pm). Same with gpulp.com/net/org. Some new center, huh?

So Sebastien has pretty much singlehandedly defeated the possibility of there being a half-decent next-generation gnutella. Any attempts to critique him or to suggest that perhaps another course of action be taken have been protested as an "attempt to split the community," since I think he believes he owns the whole of it. Ouch. Oh well. It's a textbook example of what not to do with an Open Source project, if you want to learn from it.

Ohaha doesn't seem to be going anywhere, but iMesh is still alive and kicking: there are 62,620 users online right now. Not bad, and about 30 times as many are on gnutella. Too bad there weren't hardly any files actually available on the network and that those that were "available" (with five stars on "availability") I couldn't download. Scour Exchange is (obviously) no longer available. CuteMX, which seems to be alternately yanked then reinstated every few months, is back with a new 2.5 version. Freenet is as obtuse and inaccessible as ever. Excellent academic work is being done on Jungle Monkey, but it's not clear if a truly usable and popular client (and indeed, if indeed any Windows port) comes out.

Other than Napster, what other alternatives are out there and useable? It seems that most of the more recent P2P plays in stealth mode are targeted at "semi-centralized" models - this makes sense for efficiency's sake and also for creating a business model. An open architecture with free implementations makes it difficult to make a rollicking lot of money (or at least nobody has quite figured out how this would work!). But P2P has people thinking, including people in academia. A lot of theses have yet to be written and will undoubtedly focus on efficient, decentralized systems.

I have a vision for two classes of systems. The first model is for rich media and public access. Web pages, chunks of databases, encrypted file backups, songs, movies, home videos and the like will all flow through this system. There may be some "delegated" nodes whose responsibility it is to help the network dynamically anneal to a near-optimal dataflow by providing "global insight." They won't be necessary, but they will greatly improve the efficiency of the network. Hostcatchers are crude, first-generation implementations of such delegated nodes.

The second class of system will be built for preservation of free speech. It will only carry heavily encrypted, steganographically embedded text. Through multiple layers, content from this network may bubble up to the public network, thus allowing for not only anonymous publication (mixmaster systems already do this) but for invisible publication (i.e., your local network administrator doesn't know you're publishing texts, even if she suspects it and is looking very carefully at your network traffic). Text is the primary medium for free speech and it is substantially easier to conceal than rich media: even if you drape a blanket over an elephant, it's pretty clear that there's still something there! Source code to "forbidden" programs, memoranda on government abuses, etc. will all be published on this network.

A number of people are working on the former class of systems, but disturbingly few on the latter, which really, for all the hype of other systems, is the one system that will guarantee free speech to people.

But there is one key piece of the puzzle that needs to be in place for either system to work. It is absolutely necessary, and I've got to thank my roommate, Dan Kaminsky, for pointing it out to me. The upload pipe needs to be in place. Excite @Home has capped all of their cable modems to 128kbps upstream (they used to be able to push 1mbps+!). Pacific Bell modified their ADSL algorithms to even more destructively remove download capacity as soon as there is any upload in progress - this allows them to "permit" servers and P2P traffic but make it impractical to actually run them. SDSL providers like Covad (my provider) are on the verge of bankruptcy. Alternate Internet access mechanisms (1- and 2-way satellite, fixed wireless, and Ricochet) all have a large amount of up/down disparity. Except for at universities and at really well connected workplaces, it may be impossible to practically contribute much to a P2P system with all but the absolute most expensive ($1000+/mo) connections. Translation? P2P won't be able to make it to the masses, other than leeching off of universities' connections if this trend keeps up.

Let's hope the power companies move on this whole fiber to the home thing, and fast. An 100mbps uplink in half the homes of America could prove the guarantee of free speech and an open, creative Internet for future generations. Lacking this, we may find ourselves restricted and floundering (as so brilliantly expounded by Jaron Lanier).

(Amazing how some things go full circle, ain't it?)

  
  content & layout © copyright 1995-2008 -{ david e weekly }-