Understanding the Codebase
concourse.git
The concourse.git repository houses the core Concourse projects. These projects depend on one another and are versioned together, so they must be built together in order to produce properly functioning artifacts. The concourse project contains the client code. The concourse-server project contains the server code. The client and server talk to each other using Thrift RPC. The concourse-shell project contains the code for the interactive command line client that is packaged with the server. The concourse-testing project contains integration tests that are too expensive to run on every build.
concourse
The client:
- takes input from the user,
- converts the input to a form that can be serialized by Thrift (i.e. every Java Object gets converted to a TObject),
- sends the transformed data over the wire using Thrift,
- waits for the server to send back a response, and
- transforms the server's response to a more user-friendly format (i.e. every TObject gets converted to a Java Object, some Java collections get converted to formats with better toString output, etc).
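The round trip in the list above can be sketched roughly as follows. This is an illustration only, not Concourse's client code: WireValue is a hypothetical stand-in for the Thrift-generated TObject, and sendAndWait is a hypothetical placeholder for the blocking Thrift call (here it simply echoes the request back).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for TObject: a type tag plus the raw bytes of the value.
final class WireValue {
    enum Type { INTEGER, STRING }

    final Type type;
    final byte[] data;

    WireValue(Type type, byte[] data) {
        this.type = type;
        this.data = data;
    }
}

final class ClientSketch {

    // Convert a Java object into its serializable wire form.
    static WireValue toWire(Object value) {
        if (value instanceof Integer) {
            byte[] bytes = ByteBuffer.allocate(Integer.BYTES).putInt((Integer) value).array();
            return new WireValue(WireValue.Type.INTEGER, bytes);
        }
        if (value instanceof String) {
            byte[] bytes = ((String) value).getBytes(StandardCharsets.UTF_8);
            return new WireValue(WireValue.Type.STRING, bytes);
        }
        throw new IllegalArgumentException("Unsupported type: " + value.getClass());
    }

    // Convert the wire form back into a Java object for the user.
    static Object fromWire(WireValue wire) {
        switch (wire.type) {
            case INTEGER:
                return ByteBuffer.wrap(wire.data).getInt();
            case STRING:
                return new String(wire.data, StandardCharsets.UTF_8);
            default:
                throw new IllegalArgumentException("Unknown type: " + wire.type);
        }
    }

    // Placeholder for "send over the wire and wait for the response".
    // A real client would block on an RPC here; this one echoes the request.
    static WireValue sendAndWait(WireValue request) {
        return request;
    }

    public static void main(String[] args) {
        Object response = fromWire(sendAndWait(toWire("jeff")));
        System.out.println(response);
    }
}
```

Keeping the conversions at the edge of the client means the rest of the user-facing API can work with ordinary Java types.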
For every API method that is considered a CompoundOperation, the client does a bit of work to divide the user input into the appropriate server calls and performs some post-filtering on the response sent back from the server.
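As a rough sketch of that division of work, assume a hypothetical compound call that adds one value to several records at once. The Server interface and the method signatures below are invented for illustration and are not Concourse's actual API; the point is only that one user-level call fans out into several single-record server calls whose results are gathered afterwards.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

final class CompoundOperationSketch {

    // Hypothetical stand-in for a server call that handles a single record.
    interface Server {
        boolean add(String key, Object value, long record);
    }

    // Divide one multi-record request into one server call per record,
    // then gather the per-record outcomes into a single response.
    static Map<Long, Boolean> add(Server server, String key, Object value,
            Collection<Long> records) {
        Map<Long, Boolean> results = new LinkedHashMap<>();
        for (long record : records) {
            results.put(record, server.add(key, value, record));
        }
        return results;
    }

    public static void main(String[] args) {
        // Fake server that only accepts writes to even-numbered records.
        Server server = (key, value, record) -> record % 2 == 0;
        System.out.println(add(server, "name", "jeff", Arrays.asList(1L, 2L, 3L)));
    }
}
```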
concourse-server
The server project is where most of the interesting code lives, and the most important server concept is the Engine. In general, the client receives data and requests from the user, which are sent over the wire to ConcourseServer and then passed on to the Engine, which actually manages data storage and retrieval. Ironically, the Engine manages these processes but does very little work directly. For the most part, the Engine delegates work to its subcomponents: the Buffer and the Database.
Concourse's storage model initially stores data in a temporary buffer before transporting and indexing the data in a permanent database. The Buffer and Database classes contain the logic for handling data in their respective formats, and the Engine class contains the logic for reconciling (possibly conflicting) results from the two stores and ensuring that data is passed between them in a safe and consistent manner.
The Buffer is a fairly straightforward linked list of revisions. The Database, on the other hand, is a complex system that stores three different views of data on disk in Blocks and in memory as cached Records. Blocks are responsible for efficiently sorting data on the fly, writing data to disk and loading data into memory. Records are responsible for computing the appropriate response to queries and read requests.
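The buffer-then-database flow can be illustrated with a small sketch. It is a deliberate simplification under stated assumptions, not the real Engine: the class and method names (EngineSketch, Write, verify, transportOne) are hypothetical, the "database" is a plain in-memory set rather than Blocks and Records, and there is no locking or durability.

```java
import java.util.HashSet;
import java.util.LinkedList;
import java.util.Set;

final class EngineSketch {

    enum Action { ADD, REMOVE }

    // One revision: an action applied to a key/value pair in a record.
    static final class Write {
        final Action action;
        final String datum;

        Write(Action action, String datum) {
            this.action = action;
            this.datum = datum;
        }
    }

    // Temporary store: revisions in the order they were written.
    private final LinkedList<Write> buffer = new LinkedList<>();
    // Permanent store (simplified): the set of data currently present.
    private final Set<String> database = new HashSet<>();

    static String datum(String key, Object value, long record) {
        return record + "/" + key + "/" + value;
    }

    void add(String key, Object value, long record) {
        buffer.add(new Write(Action.ADD, datum(key, value, record)));
    }

    void remove(String key, Object value, long record) {
        buffer.add(new Write(Action.REMOVE, datum(key, value, record)));
    }

    // Reconcile both stores: start from the database's answer, then let
    // newer buffered revisions override it in write order.
    boolean verify(String key, Object value, long record) {
        String datum = datum(key, value, record);
        boolean present = database.contains(datum);
        for (Write write : buffer) {
            if (write.datum.equals(datum)) {
                present = write.action == Action.ADD;
            }
        }
        return present;
    }

    // Move the oldest buffered revision into the database.
    boolean transportOne() {
        Write write = buffer.poll();
        if (write == null) {
            return false;
        }
        if (write.action == Action.ADD) {
            database.add(write.datum);
        } else {
            database.remove(write.datum);
        }
        return true;
    }

    public static void main(String[] args) {
        EngineSketch engine = new EngineSketch();
        engine.add("name", "jeff", 1);
        System.out.println(engine.verify("name", "jeff", 1)); // answered from the buffer
        engine.transportOne();
        System.out.println(engine.verify("name", "jeff", 1)); // answered from the database
    }
}
```

Note how a read gives the same answer before and after transport. Keeping reads consistent while data moves between the two stores is the reconciliation job the text assigns to the Engine.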
There are also Transactions, which are, in essence, mini-Engines that are used when the client stages operations in an ACID transaction. Just like the Engine, each Transaction is a BufferedStore which initially stores data in an in-memory Queue and eventually transports the data to the main Engine when the user commits the transaction.
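A minimal sketch of the staging idea, assuming hypothetical names throughout: writes are represented as plain strings, the destination Engine is stood in for by a List, and the conflict detection and locking that a real ACID transaction needs are omitted entirely.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

final class TransactionSketch {

    // Writes staged in memory until the transaction finishes.
    private final Queue<String> staged = new ArrayDeque<>();
    // Hypothetical stand-in for the main Engine that receives committed writes.
    private final List<String> engine;
    private boolean open = true;

    TransactionSketch(List<String> engine) {
        this.engine = engine;
    }

    void stage(String write) {
        ensureOpen();
        staged.add(write);
    }

    // Transport every staged write to the engine, in order, then close.
    void commit() {
        ensureOpen();
        open = false;
        while (!staged.isEmpty()) {
            engine.add(staged.poll());
        }
    }

    // Discard the staged writes without touching the engine.
    void abort() {
        open = false;
        staged.clear();
    }

    private void ensureOpen() {
        if (!open) {
            throw new IllegalStateException("Transaction is closed");
        }
    }

    public static void main(String[] args) {
        List<String> engine = new ArrayList<>();
        TransactionSketch transaction = new TransactionSketch(engine);
        transaction.stage("ADD name AS jeff IN 1");
        System.out.println(engine.size()); // nothing reaches the engine before commit
        transaction.commit();
        System.out.println(engine);
    }
}
```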
Finally, the role of the ConcourseServer class is to take requests from the client and to decide whether they should be sent directly to the main Engine or routed through a Transaction first (i.e. depending on whether the client is in staging or autocommit mode). Additionally, the ConcourseServer class manages security and controls access using an AccessManager.
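The routing decision might look something like the sketch below. All names here are hypothetical: clients are identified by a plain string, writes are strings, the Engine and each Transaction are simple lists, and the access checks performed by the AccessManager are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RoutingSketch {

    // Hypothetical stand-in for the main Engine.
    private final List<String> engine = new ArrayList<>();
    // Open transactions, keyed by the client that started them.
    private final Map<String, List<String>> transactions = new HashMap<>();

    // Put the client in staging mode.
    void stage(String client) {
        transactions.put(client, new ArrayList<>());
    }

    // Route a write: to the client's transaction if it is staging,
    // otherwise straight to the engine (autocommit mode).
    void write(String client, String write) {
        List<String> transaction = transactions.get(client);
        if (transaction != null) {
            transaction.add(write);
        } else {
            engine.add(write);
        }
    }

    // Apply the client's staged writes to the engine and return to autocommit mode.
    void commit(String client) {
        List<String> transaction = transactions.remove(client);
        if (transaction != null) {
            engine.addAll(transaction);
        }
    }

    List<String> engineContents() {
        return new ArrayList<>(engine);
    }

    public static void main(String[] args) {
        RoutingSketch server = new RoutingSketch();
        server.write("alice", "w1");   // autocommit
        server.stage("alice");
        server.write("alice", "w2");   // staged
        System.out.println(server.engineContents());
        server.commit("alice");
        System.out.println(server.engineContents());
    }
}
```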