This document describes the Conjure Bittorrent client to the interested reader. It is my (jlouis@mongers.org) intent to describe the system in order to make it easier to begin hacking away on the beast. This document is separated into a number of sections, each describing a certain part of the system.

Part one -- The B-coding:

Bittorrent uses a scheme known as the B-coding for all torrent files. The B-coding defines nothing more than a simple syntax tree with String and Integer types as well as lists and dictionaries. There is no magic to this coding scheme, and the advantage is that it is easy to extend the protocol with new features.

We parse the B-coding in BEncode.BEncode. We can read B-coded strings into an AST type and vice versa. The raw AST alone gains us little, however, as it is rather hard to manipulate and fetch things inside the B-coding. Therefore, there are 2 helpers to make it easier to work on the B-code:

BEncode.BType defines a simple type checker for a subset of the B-coding scheme (lists are monotyped). With this tool, it is possible to type check B-codings to see if they abide by the structure of a valid torrent file. In particular, we check that the fields we need are present in the torrent file and that we can actually read sensible values out of the AST. Next we can check whether the torrent file contains a single file or multiple files. The module Conjure.TorrentType contains type checks for Torrent files and Peer lists. It is split from the general torrent structure to facilitate the further development of each piece.

The structure Conjure.Torrent builds the last part of the parsing of torrent files. Conjure.Torrent.Torrent is the type of torrents, with accessor functions that reach directly into the BEncode.BEncode structure. Thus, we avoid a number of intermediate structures and query directly into the B-coding, while the type checker ensures that this is a safe operation.
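To make the shape of the B-coding concrete, here is a minimal sketch of such a syntax tree with an encoder and a decoder. All names (BCode, encode, decode and the helpers) are hypothetical and chosen for illustration; they are not the actual BEncode.BEncode interface. The sketch also assumes Haskell Strings where real torrent files carry byte strings, uses an association list for dictionaries, and does not enforce sorted dictionary keys.

```haskell
import Data.Char (isDigit)

-- Hypothetical AST for illustration; not the real BEncode.BEncode type.
data BCode
  = BInt Integer
  | BStr String
  | BList [BCode]
  | BDict [(String, BCode)] -- association list keeps the sketch base-only
  deriving (Eq, Show)

-- Serialise an AST to its B-coded form.
encode :: BCode -> String
encode (BInt n)   = "i" ++ show n ++ "e"
encode (BStr s)   = show (length s) ++ ":" ++ s
encode (BList xs) = "l" ++ concatMap encode xs ++ "e"
encode (BDict kv) = "d" ++ concatMap pair kv ++ "e"
  where pair (k, v) = encode (BStr k) ++ encode v

-- Parse one value; return it together with the unconsumed input.
decode :: String -> Maybe (BCode, String)
decode ('i' : rest) =
  case span (/= 'e') rest of
    (digits, 'e' : more) -> do
      n <- readInteger digits
      Just (BInt n, more)
    _ -> Nothing
decode ('l' : rest) = do
  (xs, more) <- items rest
  Just (BList xs, more)
decode ('d' : rest) = do
  (xs, more) <- items rest
  kv <- pairUp xs
  Just (BDict kv, more)
decode input@(c : _)
  | isDigit c =
      case span isDigit input of
        (len, ':' : more) ->
          let n = read len
              (s, after) = splitAt n more
          in if length s == n then Just (BStr s, after) else Nothing
        _ -> Nothing
decode _ = Nothing

-- Parse values until the closing 'e' of a list or dictionary.
items :: String -> Maybe ([BCode], String)
items ('e' : more) = Just ([], more)
items input = do
  (x, rest)  <- decode input
  (xs, more) <- items rest
  Just (x : xs, more)

-- Dictionaries are flat key/value runs where every key is a string.
pairUp :: [BCode] -> Maybe [(String, BCode)]
pairUp []                  = Just []
pairUp (BStr k : v : rest) = ((k, v) :) <$> pairUp rest
pairUp _                   = Nothing

-- Accept an optional leading minus sign followed by at least one digit.
readInteger :: String -> Maybe Integer
readInteger ('-' : ds) = negate <$> readNatural ds
readInteger ds         = readNatural ds

readNatural :: String -> Maybe Integer
readNatural ds
  | not (null ds) && all isDigit ds = Just (read ds)
  | otherwise                       = Nothing
```

For example, encode (BDict [("length", BInt 42), ("name", BStr "a.txt")]) yields "d6:lengthi42e4:name5:a.txte", and decoding that string gives back the same tree with no input left over. Fetching a field then means walking the BDict by hand, which is the inconvenience the type checker and accessor functions described above are meant to remove.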

Part two -- The FS storage manager:

In a Bittorrent client it is essential to have effective filesystem storage. Disk I/O is usually the step that determines the speed of the client. We opt for a solution with several layers, each building a new level of abstraction on top of the previous one.

At the very lowest level is the HandlePool. It contains a dictionary of recent file handles, which are replaced by an LRU scheme. The advantage is that recent file handles are already open, so we avoid open/close storms on the operating system.

On top of the HandlePool sits the Store. It provides a very convenient abstraction: it allows us to work on slots of the same size as individual pieces. That is, instead of having to keep track of the details of the underlying storage method, the Store allows us to make queries such as ``Read slot x'' or ``Store Piece y to slot z''.

On top of the Store is the StorageManager. The StorageManager is responsible for SHA1 checksumming files, for moving pieces around in the slots when needed, and for handling all FS needs of the rest of the client.

With this layered structure it should be relatively easy to replace slow parts in the future, if anything comes up (e.g. via profiling).

To bind everything together, we have an FSThread planned. It works as a multiplexer for file system requests from the rest of the code. In principle it should be able to do any kind of FS operation the rest of the software wants to have carried out.

Part three -- The announcer:

The announcer is a separate thread responsible for talking with the Tracker. The announcer must not bomb the tracker, so it makes use of a timer thread to hold it back between queries. It communicates on 2 channels: one for sending and one for receiving messages. The inbound channel contains messages about things which the client wants to query the server about.
The outbound channel contains (parsed) results from the server. By using a structure like this, it is possible to decouple tracker communications from the rest of the client. It can probably be changed later to use DHTs, when we want to support Khashmir and friends.

Part four -- The InterestTable:

The InterestTable is a straightforward steal from the original bittorrent client, mixed with a little bit of ``Not Invented Here'' (NIH) syndrome. Whenever we get messages about who has which pieces of the torrent, we send the message to piecepicker with the information. We can also query its structures for what the most interesting piece is right now, whether we have requested it from another source, and so on. It is crucial to correct client operation and it is not very hard to write at all.

The interface to the InterestTable is through the InterestStatThread. The idea is the following: upon start, we spawn an InterestStatThread and give it various parameters, like the number of torrent pieces and so on. It builds an InterestTable structure, begins to answer queries on a Channel (STM encapsulated) and registers the various bits of information that are sent to it.

Currently the protocol for conversation with the InterestStatThread consists of the following queries (denoted by [Q]) and information messages (denoted by [I]):

HavePiece p_num chan [Q] -- We look up whether we currently have this piece and return our result on the channel given. As such, the InterestStatThread keeps track of the current piece layout. It might be wiser to have this information stored in the FS system though (TODO).

RequestedFromSeeder pnum [I] -- Tell the interest table we have requested a given piece from a seeder. The table keeps track of this.

RequestedFromLeecher pnum [I] -- Tell the interest table we have requested a given piece from a leecher. The table will track this.

PeerGotPiece pnum [I] -- A peer just got a given piece.
This is used to update the internal histogram to reflect that the given piece is more common than before.

PeerLostPiece pnum [I] -- When a peer disconnects, we record this in the table to designate that some pieces just became more rare, namely the pieces the peer had (FIXME: maybe it should be [pnum] here?).

PieceCompeleted pnum [I] -- When we complete a piece, we are not interested in it anymore. Tell the table this, so it can track the change.

NextPiece have_predicate return_chan [Q] -- Select the most interesting piece, filtered by the have_predicate, and return the answer on the return_chan. The smart thing is that each peer sends a have_pred based on what that peer has available, and the operation then selects the best piece according to this predicate.

Shutdown [I] -- Tell the Table to shut down.

Part five -- The Peer client thread:

Part six -- The Master thread:

The Master thread should set up all other threads and the communication channels between them, and orchestrate the whole process of downloading. The current vision of the inter-relationships between threads is depicted in the sequence diagram threads-orchestration.ps (produced by tsqd from the TCM toolkit; the original is in threads-orchestration.sqd).

Part seven -- The Logger:

The Logger (Conjure.Logger) provides the common logging/reporting facilities. It is modeled after syslog(3): all messages have two attributes, "facility" (currently hidden from the API user) and priority. For user convenience, a set of helper functions (debugM, infoM, noticeM, etc) is provided, each of which submits a message with the respective priority.

Currently the logger logs all messages with priority WARNING or above to STDERR. If verbose logging was requested, all messages are logged to STDERR.

The framework leaves room for expansion, so we can expect a separate facility for each program module, different log targets for different facilities/priorities, etc. If you have any experience with syslog configuration, you get the idea.
For more information, consult the MissingH manuals.
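
The filtering rule described in Part seven can be sketched as follows. This is not the Conjure.Logger or MissingH API: the Priority type, shouldLog, formatMsg, logM and the extra Bool "verbose" argument on the helpers are all assumptions made to keep the sketch self-contained. The priority names follow syslog(3), ordered from least to most severe.

```haskell
import System.IO (hPutStrLn, stderr)

-- Hypothetical priority type, ordered from least to most severe.
data Priority
  = Debug | Info | Notice | Warning | Error | Critical | Alert | Emergency
  deriving (Eq, Ord, Show, Enum, Bounded)

-- A message is written if verbose logging was requested,
-- or if its priority is Warning or above.
shouldLog :: Bool -> Priority -> Bool
shouldLog verbose prio = verbose || prio >= Warning

-- Render a message with its priority as a prefix (format is an assumption).
formatMsg :: Priority -> String -> String
formatMsg prio msg = "[" ++ show prio ++ "] " ++ msg

-- Write the message to STDERR when the filter lets it through.
logM :: Bool -> Priority -> String -> IO ()
logM verbose prio msg
  | shouldLog verbose prio = hPutStrLn stderr (formatMsg prio msg)
  | otherwise              = return ()

-- Convenience helpers, one per priority, in the spirit of debugM and friends.
debugM, infoM, noticeM, warningM :: Bool -> String -> IO ()
debugM   verbose = logM verbose Debug
infoM    verbose = logM verbose Info
noticeM  verbose = logM verbose Notice
warningM verbose = logM verbose Warning
```

With verbose logging off, debugM False "peer connected" writes nothing, while warningM False "tracker timed out" writes a line to STDERR. A per-module facility would presumably become one more argument to the filter, but that part is not described in enough detail here to sketch.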