| <david.weekly.org> | January 6 | 2009 | |
| projects | SafeX | ||
|
SafeX = Secure and Anonymous File EXchange.
The IdeaTo get on SafeX, you download a small client with which you then construct a list of your friends on SafeX. A small dialog box pops up on each friend's box, asking if you are to be trusted. A PGP fingerprint might be presented to be verified over the phone.Your computer then records the IPs of the SafeX nodes with which it has established a trust relationship and securely records it to disk. A private key, used to encrypt all transactions and corresponding disk storage, could be uploaded to a Rio player or any such flash-memory based device for secure, portable storage and encrypted itself with a pass phrase. (This is actually the exact purpose for which I wrote the first piece of code to upload a file on the Rio to a computer.) The PGP protocol would ensure that the message was not only encrypted but also authenticated to come from a given node, preventing man-in-the-middle attacks and snooping. A quick "panic!" command from a given node would immediately erase all local transactions, flush the secure cache (more on this later), and inform all connected nodes to erase the transmitting node's information from their repositories. Information on SafeX would be presented as a giant, hierarchical list of files and (anonymized or authenticated) discussions. You would provide your client with a list of files and/or directories that you wished to post on SafeX. The client would run a secure hash (e.g. MD5) on each file to fingerprint it and publish the filenames and hashes to all connected nodes. The other nodes would, in turn, broadcast their "accessible" lists (about to be explained!). The client then checks for hash collisions, sorts, and creates a concatenated list of all of the files that it can access, locally or through a connected client. This "accessible" list ("You can call me AL") is then rebroadcast to all listening hosts. Each computer stores locally an identifier that tells it which PC it heard about each file from. A rough metric of "network distance" is appropriately summed and attached to each item, thus allowing for a self-optomizing route through SafeX to the "closest" copy of a file. Some of you might point out that this will involve the transmission of redundant information: this is true. However, it A) simplifies the protcol B) doesn't matter since there will simply be a hash collision and the duplicate will be discarded and C) makes it less prone to probablistic analysis. With regards to probablistic analysis, some of you may point out that having lists of differing size would leak some information to an observer. This could be easily circumvented by first sending an encrypted broadcast request for the size of the list and then sending the encrypted request: the servers would pad their lists to be as long as the longest one, revealing nothing about who knew how much. Back to the system... When a request for a given file is made, the request is encrypted with the public key of the machine that originally advertised the file's availability and broadcast to attached SafeX nodes. All of the nodes but the target discard the junk packet: the target node decrypts the packet and deciphers the request. If the file is local to the target, it feeds the data back to the client. If the data is not local to the target, the target repeats this "pseudo-broadcast" to the "nearest" node to the file. When the response is recieved it is conditionally cached (depending on user preferences) and retransmitted to the client. It is quickly seen that such a protocol could potentially reveal the true transmitter of the file if the network was under strict supervision. The solution is for answering nodes to feed *all* attached nodes the encrypted information. Given a large number of attached nodes, the linearly scaling nature of this solution makes it relatively impractical to implement, save for on extremely high-speed networks (i.e., University ethernet and not 28.8!). The interesting alternative is to implement broadcast through multicast, if available. While this would certainly still not solve the problem of non-target nodes receiving an excessive number of packets that they would have to discard, it would significantly ease network and server usage. Let's do a quick walk-through: 'A' downloads SafeX. 'A' claims 'B' and 'C' as friends. Both accept. 'C' is connected to 'D' and 'E' who both have massive file repositories. 'C,' 'D,' and 'E,' are all friends with 'F,' 'G,' and 'H.' 1) 'A' compiles a local list of files, computes their checksums, and transmits the list in separate transmissions to B and C -- encrypted with each node's respective keys. 2) B and C each append As list to their own, adding the "network distance" from each to A and recording that the file was obtained from A. 3) B and C both respond to A by announcing the length of their lists. 4) A calulates the greater of the two and transmits a list request to B and C. 5) B and C send their lists to A, padding the result if neccessary. 6) A appends B and Cs list to its own, appropriately adding the network distance and noting which files came from who. 7) User on A selects a file in the list for downloading. This file resides on D. 8) A broadcasts a file request to both B and C, encrypted with Cs key, since the file was in Cs transmitted access list. 9) B ignores the packet and C retransmits the request to D, E, F, G, and H, but encrypted with Ds key, since D registers the shortest network path to the file. (F could potentially get the file from D and then C could get it from F, but getting it straight from D will likely be fastest.) 10) D encrypts the file with Cs key and transmits it (perhaps after a random amount of time between 1-10 minutes) to C, F, G, and H. A random amount of buffer is added to the file to prevent guessing at the contents from the length of transmission. 11) C decrypts the file and stores it locally. F, G, and H discard the unreadable packets. (They don't know who requested the file, or even what it was!) 12) C encrypts the file with As key and transmits it to A, D, E, F, G, and H after a random amount of time and with a random amount of padding. 13) D, E, F, G, and H ignore the unreadable packets. A decrypts her file and stores it locally! A simple variant on this could involve passing As public key along with the request -- the originating server would have a second (internal) encryption with As key. This would prevent caching but would prevent middle (even though admittedly trusted) stations from observing the unencrypted content. Note that D only knows that C is requesting a file and doesn't know whether C is a proxy or a client. Note also that A doesn't know who C is getting the file from (although A might be able to deduce some information about this from the "network distance" metric -- while this parameter could be left out to increase (superanal!) security it might make managing the SafeX topology considerably more difficult). As for bandwidth efficiency in the protocol...well, let's just say that without multicast it's quite difficult to have a bandwidth-efficient algorithm that can defeat probabilistic analysis of the network. And a key fact is brutally exposed: which computers are connected to SafeX and to which others. Other than dumping bytes randomly to random IPs (further increasing unneccessary bandwidth!) I'm not sure of a good way around this. Any very good & complete setup will make the network look like nothing but a hell of a lot of white noise: sort of like building a giant man-made waterfall so that one can stand under it and whisper without being heard. Yes, for anybody who was wondering, I *am* in the middle of Cryptonomicon. A heartily recommended read for any of you. I hope the detail I have outlaid in SafeX (no, I have not implemented it, partly for time and partly for moral issues) makes it clear that it is possible to create a system capable of quietly and anonymously serving a large quantity of files and messages. (I have some preliminary ideas for a anonymous and/or authenticated Instant Messaging system that sits on top of SafeX.) It is understood that there may be some significant shortcomings and holes and unthought-out portions to this concept, but the groundwork exists sufficient to prove such a project as being plausible and relatively bullet-proof. (If the RIAA thought they were being high-tech by looking sites up with Yahoo! and Network Solutions, they have seen nothing yet.)
The ResponseIt seems pretty well thought out. I'm wondering if you've read the paper Ron Rivest did on (I believe he called it) winnowing and chafing. Which, if I'm remembering correctly, would have a similiar result but the bandwidth usage would be much smaller. The anology would be instead of building a giant waterfall... you would make it look like there was a giant waterfall so you couldn't see the real trickle of data behind it.
The ReplyThanks for the response and the pointer to Rivest's article (online at http://theory.lcs.mit.edu/~rivest/chaffing.txt -- found it with Yahoo!). It's a very interesting paper, but it would appear that chaffing & winnowing are quite different from SafeX. While SafeX does hope to make transmissions unreadable (via encryption) SafeX also tries to make it so that it is unclear even who is transmitting to who and to link an A->B transmission with a B->C transmission. One thing that I left out of SafeX was the possibility for machines to also randomly send out completely junk packets, especially after any other packet has been received. Such a (semi-)continuous stream of packets would make it very difficult to discern when an actual SafeX packet was being sent, let alone to whom, let alone what the request might contain! The time may come soon to start building SafeX. I don't think I'll put it together unless I believe there to be a dire need... WRT Rivest's article, initial effective encryption is (initially) achieved with a minimum of a 200x increase in data flow. A 4kb message would thus take nearly a megabyte of secure transmission. His second iteration, using package transforms, uses less data but is seems quite susceptible to brute force cryptanalysis for smaller messages: a 10kb message chunked into 10 1kb wheat chunks with even 10 1kb chaff chunks yields a mere 1024 possibilities for an attacker to chew through to come up with the original message. I think that with SafeX it should be possible to sufficiently mask data transfers by sending bogus packets to only ~3 other computers (not all attached nodes!) -- this would make for a 4x increase in data over what was required, but this is a reasonable goal for a high-speed network. Note also with SafeX that the distributed, hierarchical caching built into the system delegates the job of serving files to the entire network, easing load on originators of content. UPDATE: looks like FreeNet is implementing a subset of SafeX's features. Cool!
| ||
| content & layout © copyright 1995-2008 -{ david e weekly }- | ||