Gnutella and the State of P2P
February 03, 2001
written for the pho list
To be honest, Gnutella is not doing very well.
There are about 2300 gnutella hosts on the network accepting incoming connections [statistic from LimeWire.com] that are part of the “big cloud” (i.e., the public network; it’s also possible to set up independent trading “clouds” on gnutella). One interesting consequence of this is that all of the public nodes are known and easily harvestable. Since most nodes hooked up to gnutella are on high-speed links whose IP addresses don’t change very often, unless you’re behind a trusty firewall you may find yourself open to attack.
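To make “easily harvestable” concrete, here’s a minimal sketch in Python of how a crawler could pull host addresses out of the Pong replies that any connected node receives. The field layout is taken from the gnutella v0.4 descriptor spec as I understand it; the connection handshake and socket handling are omitted.

    import struct

    # gnutella v0.4 descriptor header: 16-byte descriptor ID, 1-byte
    # payload type (0x01 = Pong), 1-byte TTL, 1-byte hops, and a
    # 4-byte little-endian payload length.
    HEADER = struct.Struct("<16sBBBI")
    # Pong payload: 2-byte little-endian port, 4 raw bytes of IPv4
    # address, then file count and kilobytes shared.
    PONG = struct.Struct("<H4sII")

    def harvest_pongs(stream):
        """Collect (ip, port) pairs from a raw descriptor stream."""
        hosts = []
        offset = 0
        while offset + HEADER.size <= len(stream):
            _id, ptype, _ttl, _hops, plen = HEADER.unpack_from(stream, offset)
            offset += HEADER.size
            if ptype == 0x01 and plen >= PONG.size:  # it's a Pong
                port, raw_ip, _files, _kb = PONG.unpack_from(stream, offset)
                hosts.append((".".join(str(b) for b in raw_ip), port))
            offset += plen  # skip to the next descriptor
        return hosts

Point a few of these at the default hostcatchers, let the Pongs flow in, and you have a tidy list of most of the public cloud.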
Gnutella prides itself on being a true peered network, but there are fewer than a dozen centralized “hostcatchers.” Lacking these, the network could not function. While it’d be easy to set up a new hostcatcher if others were taken down, the fact of the matter is that most gnutella programs come preprogrammed to connect to a specific hostcatcher. Send a few letters to the ISPs hosting those servers and you could cut off the majority of gnutella users and make it a real pain for new users to join.
There is a wide variety of clients (a good thing), but most of them suck. It’s not really fair to be that critical of the original program or even the gnutella architecture, since it was only a quick hack / beta implementation on Justin’s part (and Justin has since been gagged by AOL). I can’t imagine how frustrated he must be at people (perhaps like me) railing at gnutella’s flaws and its lack of ease of use when what’s out there is a prerelease developer beta that he’s unable to continue.
There’s a pretty good review of the clients out there at LimeWire, but I have to say that BearShare is by far my favorite. It presents much more information and is much easier to use (IMHO) than any of the other clients.
But even with BearShare’s pretty and helpful UI, gnutella is not very
usable. The main reason is bandwidth. Just being connected to the
network (not downloading or uploading any files) is pushing about
100kbps of traffic onto the network, with bursts up to 250kbps. Most
cable modems are capped at about 128kbps of upstream traffic. Even
high-quality ADSL can’t usually push more than 100kbps or so, and
that’s already going to pretty much nuke your download capability
(ADSL is odd like that!). So just being connected to gnutella requires
you to be on “extreme broadband,” such as a work connection or an SDSL
line. This is again before you are actually uploading or downloading
any files.
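Here’s the back-of-envelope math on where that idle-connection traffic comes from (the message sizes and query rates below are illustrative assumptions, not measurements): every query a node receives on one link gets rebroadcast on all of its other links, so relay traffic multiplies with connection count.

    # Rough model of idle gnutella overhead. All numbers are
    # illustrative assumptions, not measurements.
    connections = 4        # open peer links
    queries_per_sec = 10   # queries arriving on each link
    avg_msg_bytes = 100    # 23-byte header plus search string, roughly

    incoming = connections * queries_per_sec * avg_msg_bytes * 8
    # Each query arriving on one link is rebroadcast on the
    # other (connections - 1) links:
    outgoing = incoming * (connections - 1)

    print("incoming: %d kbps" % (incoming // 1000))  # 32 kbps
    print("outgoing: %d kbps" % (outgoing // 1000))  # 96 kbps

With plausible numbers you land right around that 100kbps figure before a single file moves.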
There are other problems, too. Only one out of every ~15 search
results will actually be downloadable, and a good percentage of those
that do succeed are pretty slow (possibly because they are already
burning all of their bandwidth in simply participating on the
network). On Napster, around 2 out of 3 search results are actually downloadable, leading to an environment much more amenable to actual use.
On a deeper level, Justin’s gag on continued work on this project means that this space will probably remain pathetically leaderless. Sure, various figures pop up from time to time to try to shepherd the gnutella development/user communities in various directions (and are periodically mistaken for “the creators of Gnutella,” as happened in the case of Gene Kan), but this really is Justin’s aborted baby. Best, IMHO, to let it die and create something new to replace it.
I can assure you that that entity will not be gPulp (formerly known as
gnutellaNG, for Gnutella: Next Generation). For those of you not in on
the story, a number of people who had put together gnutella clones and
who understood some of the limitations of the original (unfinished)
gnutella architecture wanted to band together to create a
next-generation protocol and architecture that would provide a great
deal more robustness, speed, scalability, and flexibility. They
started talking, setting up a mailing list and website (at
gnutellang.wego.com) and passed around a lot of really great academic
papers on scalable peered systems (some of which are still sitting on
the message boards). But this group, since it was talking and not
coding, rapidly got very political. One member, Sebastien Lambla, who
had taken charge of the gnutellang website, spontaneously declared the
project renamed to gPulp and announced himself as the “gPulp
Consortium President.” Pretty silly, considering there wasn’t even an
attempt at consensus among the other developers. Sebastien proceeded to turn gPulp into something different from a next-generation gnutella, announcing that it would be a “resource discovery protocol” (without actually specifying the mechanics of file transfer or other peered solutions).
Sebastien, as if he weren’t already making enough enemies, then decided to set up gpulp.com/org/net and establish those as the new center. As part of the move-over process he announced (and the announcement is still on the gnutellang page) that he was going to delete all of the message board content on the gnutellang website. Keep in mind that the discussions and resource links in those posts are *the only valuable thing the gnutellang group has done to date*. He now redirects people to gpulp.tv as the new center for next-generation development. Yes, go there. No, you won’t be able to connect (as of 2/3/01 @ 1:30pm). Same with gpulp.com/net/org. Some new center, huh?
So Sebastien has pretty much singlehandedly defeated the possibility
of there being a half-decent next-generation gnutella. Any attempts to
critique him or to suggest that perhaps another course of action be
taken have been protested as an “attempt to split the community,”
since I think he believes he owns the whole of it. Ouch. Oh well. It’s
a textbook example of what not to do with an Open Source project, if
you want to learn from it.
Ohaha doesn’t seem to be going anywhere, but iMesh is still alive and kicking: there are 62,620 users online right now. Not bad, and about 30 times as many as are on gnutella. Too bad there were hardly any files actually available on the network, and that those that were “available” (with five stars for “availability”) I couldn’t download. Scour Exchange is (obviously) no longer available. CuteMX, which seems to be alternately yanked and reinstated every few months, is back with a new 2.5 version. Freenet is as obtuse and inaccessible as ever. Excellent academic work is being done on Jungle Monkey, but it’s not clear whether a truly usable and popular client (or, indeed, any Windows port) will come out.
Other than Napster, what other alternatives are out there and useable?
It seems that most of the more recent P2P plays in stealth mode are
targeted at “semi-centralized” models – this makes sense for
efficiency’s sake and also for creating a business model. An open
architecture with free implementations makes it difficult to make a
rollicking lot of money (or at least nobody has quite figured out how
this would work!). But P2P has people thinking, including people in
academia. A lot of theses have yet to be written and will undoubtedly
focus on efficient, decentralized systems.
I have a vision for two classes of systems. The first model is for
rich media and public access. Web pages, chunks of databases,
encrypted file backups, songs, movies, home videos and the like will
all flow through this system. There may be some “delegated” nodes
whose responsibility it is to help the network dynamically anneal to a
near-optimal dataflow by providing “global insight.” They won’t be
necessary, but they will greatly improve the efficiency of the
network. Hostcatchers are crude, first-generation implementations of
such delegated nodes.
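No one has built such a delegated node yet, but as a sketch of the idea (the capacity reports and the pairing rule here are entirely hypothetical): it could collect rough self-reported stats from peers and suggest neighbors of similar capacity, so that fast links drift toward the core of the topology instead of being randomly wired to modems.

    # Hypothetical delegated node: peers self-report upstream capacity,
    # and the node suggests neighbors with similar capacity. The
    # reporting format and matching rule are invented for illustration.
    def suggest_neighbors(reports, who, k=4):
        """reports: dict of host -> upstream kbps."""
        mine = reports[who]
        candidates = [(abs(kbps - mine), host)
                      for host, kbps in reports.items() if host != who]
        return [host for _, host in sorted(candidates)[:k]]

    reports = {"a": 1500, "b": 128, "c": 96, "d": 1200, "e": 144}
    print(suggest_neighbors(reports, "a", k=2))  # ['d', 'e']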
The second class of system will be built for preservation of free
speech. It will only carry heavily encrypted, steganographically
embedded text. Through multiple layers, content from this network may bubble up to the public network, allowing not only anonymous publication (mixmaster systems already do this) but invisible publication (i.e., your local network administrator doesn’t know you’re publishing texts, even if she suspects it and is looking very carefully at your network traffic). Text is the primary medium for
free speech and it is substantially easier to conceal than rich media:
even if you drape a blanket over an elephant, it’s pretty clear that
there’s still something there! Source code to “forbidden” programs,
memoranda on government abuses, etc. will all be published on this
network.
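As a toy illustration of the kind of embedding I mean (whitespace encoding is the crudest possible scheme and trivially detectable; a real system would need something far subtler), a short message can ride invisibly in the spacing of an ordinary cover text:

    # Toy steganography: one space between words encodes a 0,
    # two spaces encode a 1. Crude and detectable, but it shows
    # how plain text can carry a hidden payload.
    def embed(cover_words, bits):
        out = cover_words[0]
        for word, bit in zip(cover_words[1:], bits):
            out += ("  " if bit == "1" else " ") + word
        return out

    def extract(text):
        bits, run = "", 0
        for ch in text + "x":       # sentinel flushes a trailing run
            if ch == " ":
                run += 1
            else:
                if run:
                    bits += "1" if run == 2 else "0"
                run = 0
        return bits

    cover = "we hold these truths to be self evident that all".split()
    print(extract(embed(cover, "101101001")))  # -> 101101001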
A number of people are working on the former class of systems, but disturbingly few on the latter, which, for all the hype around other systems, is really the one that will guarantee free speech to people.
But there is one key piece of the puzzle that needs to be in place for either system to work. It is absolutely necessary, and I’ve got to thank my roommate, Dan Kaminsky, for pointing it out to me: the upload pipe needs to be in place. Excite @Home has capped all of their cable modems at 128kbps upstream (they used to be able to push 1mbps+!). Pacific Bell modified their ADSL algorithms to even more destructively remove download capacity as soon as any upload is in progress; this lets them “permit” servers and P2P traffic while making it impractical to actually run them. SDSL providers like Covad (my provider) are on the verge of bankruptcy. Alternate Internet access mechanisms (1- and 2-way satellite, fixed wireless, and Ricochet) all have a large up/down disparity. Except at universities and really well-connected workplaces, it may be practically impossible to contribute much to a P2P system on anything but the absolute most expensive ($1000+/mo) connections. Translation? If this trend keeps up, P2P won’t make it to the masses, other than by leeching off of universities’ connections.
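The arithmetic is brutal. Take a single ~5MB MP3 (the file size and link speeds below are illustrative):

    # Time for one peer to serve a single ~5MB MP3 upstream.
    size_bits = 5 * 1024 * 1024 * 8
    for name, kbps in [("old @Home (~1mbps)", 1000),
                       ("capped cable (128kbps)", 128)]:
        print("%s: %.0f seconds" % (name, size_bits / (kbps * 1000.0)))
    # old @Home (~1mbps): 42 seconds
    # capped cable (128kbps): 328 seconds

At the capped rate, serving one five-minute song takes over five minutes, and that’s with the pipe fully dedicated to a single upload.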
Let’s hope the power companies move on this whole fiber-to-the-home thing, and fast. A 100mbps uplink in half the homes of America could prove to be the guarantee of free speech and an open, creative Internet for future generations. Lacking this, we may find ourselves restricted and floundering (as so brilliantly expounded by Jaron Lanier).
(Amazing how some things go full circle, ain’t it?)