Give users the ability to bulk import raw data from a single CSV file into Concourse.
This feature will enable more testing because it will be easier to seed Concourse with large and diverse data sets. It should also increase Concourse usage by third-party applications because they will be able to bring data from their existing stores into Concourse easily.
The Import Framework is designed to provide an out-of-the-box pipeline for bringing raw data into Concourse while also defining an extensible framework that can be used to implement custom logic for specific imports.
A user wants to take a CSV file (with or without headers) and atomically import the data into Concourse using a CLI. We choose CSV as the supported format because other formats (e.g. xls, sql) can be converted to it and because many languages support it. In the future, we may provide out-of-the-box solutions for additional formats.
A user wants to import individual lines of a CSV file into a new record.
A user wants to import individual lines of a CSV file into one or more existing records by specifying a resolveKey. For each line, the importer should find all the records in which the resolveKey maps to a value equal to one of the values associated with the resolveKey in the line. The importer should then import all the data in the line into each of those records.
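The resolveKey behavior described above can be sketched against a small in-memory store. This is only an illustration under stated assumptions: the `ResolveKeySketch` class, its methods, and the map-based store are hypothetical stand-ins and not the Concourse API. A line that resolves to no record is simply skipped here, because this document does not specify what should happen in that case.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory stand-in used only to illustrate resolveKey matching. */
final class ResolveKeySketch {

    // record id -> key -> values (assumed shape, not the real storage model)
    private final Map<Long, Map<String, Set<String>>> records = new HashMap<>();

    /** Adds a value under a key in the given record. */
    void add(long id, String key, String value) {
        records.computeIfAbsent(id, r -> new HashMap<>())
               .computeIfAbsent(key, k -> new LinkedHashSet<>())
               .add(value);
    }

    /** Returns the values stored under a key in a record (empty if none). */
    Set<String> get(long id, String key) {
        return records.getOrDefault(id, Collections.emptyMap())
                      .getOrDefault(key, Collections.emptySet());
    }

    /** Finds every record where resolveKey maps to one of the line's values for resolveKey. */
    Set<Long> resolve(String resolveKey, Map<String, Set<String>> line) {
        Set<String> wanted = line.getOrDefault(resolveKey, Collections.emptySet());
        Set<Long> matches = new TreeSet<>();
        for (Map.Entry<Long, Map<String, Set<String>>> entry : records.entrySet()) {
            Set<String> stored =
                    entry.getValue().getOrDefault(resolveKey, Collections.emptySet());
            if (!Collections.disjoint(stored, wanted)) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    /** Imports all data in the line into each resolved record; returns the records touched. */
    Set<Long> importLine(String resolveKey, Map<String, Set<String>> line) {
        Set<Long> targets = resolve(resolveKey, line);
        for (long id : targets) {
            line.forEach((key, values) -> values.forEach(value -> add(id, key, value)));
        }
        return targets;
    }
}
```

In this sketch, a line with `customer_id = 678` and `status = ACTIVE` would add `status = ACTIVE` to every record whose `customer_id` already contains 678 and leave all other records unchanged.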
A user wants to import data that contains links to existing records by declaring or transforming raw data into a resolvable link. When the importer encounters a resolvable link, similar to a resolveKey, it finds all the records that have the key specified in the resolvable link mapping to the value specified in the resolvable link on the line. The importer should then link the key associated with the resolvable link on the line to all those resolved records.
Let's assume I have customer data in Concourse. Each customer record has a "customer_id" key that maps to a numeric value. Now I want to import account data, where each account record has a foreign key "reference" to the customer_id of the customer that owns the account. So, I want to import that data and link each account to the appropriate customer record that already exists. Using the Import Framework, I should be able to do this by specifying a resolvable link in my raw data.
account_number | customer | account_type |
---|---|---|
12345 | @<customer_id>@678@<customer_id>@ | SAVINGS |
This means that I want to import the line into a new record and link the "customer" key in the record to all the records that have a "customer_id" key that maps to 678.
We should provide a utility and mechanism in the framework for the user to easily convert raw data to a resolvable link without having to know the appropriate format.
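A minimal sketch of such a utility follows, assuming the resolvable link format is exactly the `@<key>@value@<key>@` shape shown in the example table. The `ResolvableLinks` class and its method names are hypothetical and are not taken from the framework.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; the names are illustrative, not the framework's API. */
final class ResolvableLinks {

    // Matches @<key>@value@<key>@ where both <key> tokens are identical.
    private static final Pattern FORMAT =
            Pattern.compile("^@<([^>]+)>@(.*)@<\\1>@$");

    private ResolvableLinks() {}

    /** Wraps a raw value in memory so the importer can treat it as a resolvable link. */
    static String toResolvableLink(String key, String rawValue) {
        return "@<" + key + ">@" + rawValue + "@<" + key + ">@";
    }

    /** Returns {key, value} if the text is a resolvable link, otherwise empty. */
    static Optional<String[]> parse(String text) {
        Matcher matcher = FORMAT.matcher(text);
        if (!matcher.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { matcher.group(1), matcher.group(2) });
    }
}
```

With this sketch, `ResolvableLinks.toResolvableLink("customer_id", "678")` produces the `customer` cell from the example, so the user never has to type the format by hand and the raw file stays unchanged.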
Feature | Description | Notes |
---|---|---|
Config Framework | Create an IV framework to handle reading/writing Concourse configuration files | |
CLI Framework | Create an IV framework to facilitate the creation of client-side CLIs that interact with Concourse | |
Test Framework | Create an IV framework that provides a mechanism for spinning up Concourse test environments | |
Convert raw string data to appropriate Java objects | Write shareable logic that can be used to convert raw string data to the appropriate Java object based on sensible rules | |
Abstract logic to import into new records | Create logic to import a single line/group of data into a new record | |
Abstract logic to import into existing records | Create logic to import a single line/group of data into one or more existing records by using the resolveKey to find the appropriate records | |
Utility for converting raw data to a resolvable link | Create some utility method(s) to convert raw data into the format that specifies a resolvable link. This utility should not alter the raw data, but it should convert it in memory and pass it off to the rest of the import logic. | |
Abstract logic to import resolvable links | Create logic to handle resolvable links | |
Abstract wrapping of imports in a single transaction | Make sure that the import happens in a single transaction | |
Generic CSV importer | Write an importer that can handle a generic CSV file with headers | |
Generic CSV import CLI | Write a CLI that uses the generic CSV importer | |
Package CSV import CLI as standalone app | Package the generic CSV importer and CLI as an application that can be run from anywhere (possibly on Windows too!) | |
Package CSV import CLI with concourse-server | Package the generic CSV importer and CLI with concourse-server (similar to what is done with CaSH). | |
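
The "Convert raw string data to appropriate Java objects" feature could look roughly like the sketch below. The conversion rules are assumptions made for illustration (boolean, then integer, then long, then decimal, otherwise the original string); this document only asks for "sensible rules" and does not define them. The `RawValues` class name is hypothetical.

```java
/** Hypothetical converter; the rules below are assumed, not specified by this document. */
final class RawValues {

    private RawValues() {}

    /** Converts a raw CSV cell to a Boolean, Integer, Long, Double, or String. */
    static Object convert(String raw) {
        String text = raw.trim();
        if (text.equalsIgnoreCase("true") || text.equalsIgnoreCase("false")) {
            return Boolean.parseBoolean(text);
        }
        if (text.matches("[+-]?\\d+")) {
            try {
                return Integer.parseInt(text);
            } catch (NumberFormatException tooBigForInt) {
                // Fall through and try the wider type.
            }
            try {
                return Long.parseLong(text);
            } catch (NumberFormatException tooBigForLong) {
                return raw; // Keep values that overflow a long as strings.
            }
        }
        if (text.matches("[+-]?\\d*\\.\\d+")) {
            return Double.parseDouble(text);
        }
        return raw;
    }
}
```

Under these assumed rules, the `account_number` cell `12345` from the earlier example would become an `Integer` and `SAVINGS` would stay a `String`.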