Configuration - Inputs

inputs

Type: Array of Object. Each item in this array must match one of the following definitions.

An input defines the data source and format to be used in RODB. It also defines how to translate each record in the data source to an internal structure to be used in RODB.

Example:

inputs:
  - name: cities
    type: csv
    path: ./cities.csv
    ignoreFirstRow: true
    columns:
      - name: name
        parser: string
      - name: prefecture
        parser: string
      - name: population
        parser: integer
  - name: countries
    type: json
    path: ./countries.json

CSV

inputs[type = “csv”]

Type: Object

A CSV input reads data from a CSV file. Each CSV row translates into one record. This input provides several settings, described below.

Examples:

name: cities
type: csv
path: ./cities.csv
ignoreFirstRow: true
columns:
  - name: name
    parser: string
  - name: prefecture
    parser: string
  - name: population
    parser: integer

name: countries
type: csv
path: ./countries.csv
ignoreFirstRow: true
autodetectColumns: true
delimiter: ","

Properties:

name

inputs[type = “csv”].name

Type: String

The name of this input, which any other component will use to refer to it.

type

inputs[type = “csv”].type

Must have the value: "csv"

path

inputs[type = “csv”].path

Type: String

The relative or absolute path to the data file on this filesystem.

RODB only reads the input files from the filesystem. Other data sources (NFS, SSH, S3…) should be handled by mounting them in the filesystem using external tools (Docker or Kubernetes volumes, sshfs, s3fs…).

delimiter (optional)

inputs[type = “csv”].delimiter

Type: String

Default value: ","

The character used by the CSV file to separate each column inside a record.
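As an illustration, a semicolon-separated file could be declared as in the sketch below. The input name and file path are hypothetical; only the delimiter value differs from the default.

```yaml
# Hypothetical input reading a semicolon-separated file.
name: cities
type: csv
path: ./cities-semicolon.csv   # hypothetical path
delimiter: ";"                 # overrides the default ","
columns:
  - name: name
    parser: string
  - name: population
    parser: integer
```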

autodetectColumns (optional)

inputs[type = “csv”].autodetectColumns

Type: Boolean

Whether RODB should attempt to automatically detect the column list from the first row of the file. When this parameter is true, the columns parameter must not be defined.

The property names in the parsed record are strictly identical to the content of the first row of the CSV. No transformation is applied: case, spaces, and special characters are preserved in the final property names.

When this parameter is used, all columns internally get the string type, using the string parser. No type detection or casting is attempted.

Enabling this setting does not prevent the first row from being included in the data. To exclude it, refer to the ignoreFirstRow parameter.
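To make the naming rule concrete, the sketch below assumes a hypothetical file whose header row contains spaces and special characters. The file content shown in the comments is invented for illustration only.

```yaml
# Hypothetical content of ./cities.csv:
#   City Name,Population (est.)
#   Example City,12345
name: cities
type: csv
path: ./cities.csv
autodetectColumns: true   # property names are taken from the first row, unchanged
ignoreFirstRow: true      # keeps the header row out of the data
# Each record then exposes the properties "City Name" and "Population (est.)".
# Both values are strings (for example "12345"), since no casting is attempted.
```

Without ignoreFirstRow set to true, the header row itself would also appear as a record.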

ignoreFirstRow (optional)

inputs[type = “csv”].ignoreFirstRow

Type: Boolean

Whether or not the first row of the CSV file should be ignored. When true, the first row is not included in the data. When false, the first row is included in the data.

Even if autodetectColumns is enabled, the first row will still be included in the data unless ignoreFirstRow is set to true.

columns (optional)

inputs[type = “csv”].columns

Type: Array of Object. Empty array not allowed.

This parameter is required, unless autodetectColumns is true (in which case it must not be set).

If the number of defined columns is lower than the number of columns appearing in the data, the remaining columns are ignored and not included in the record. If the number of defined columns is higher than the number of columns appearing in the data, the trailing columns are assigned the null value.
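The mismatch rule can be pictured with the following sketch. The data rows in the comments are hypothetical; the outcome noted for each row simply restates the rule above.

```yaml
# Hypothetical data rows in ./cities.csv:
#   Kyoto,Kyoto,1000,extra
#   Nara,Nara
name: cities
type: csv
path: ./cities.csv
columns:
  - name: name
    parser: string
  - name: prefecture
    parser: string
  - name: population
    parser: integer
# Row 1 has four values: the fourth one ("extra") is ignored.
# Row 2 has two values: population is assigned the null value.
```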

Items of the array

inputs[type = “csv”].columns[]

Type: Object

Properties:

name

inputs[type = “csv”].columns[].name

Type: String

The name of this column. This is the name that the other components of RODB use to refer to this property. It must be unique within this input.

There are no restrictions on the content of this name. Any Unicode string is valid.

parser (optional)

inputs[type = “csv”].columns[].parser

Type: String

Default value: "string"

The name of the parser to apply to this column’s value.

dieOnInputChange (optional)

inputs[type = “csv”].dieOnInputChange

Type: Boolean

Default value: true

RODB internally identifies each record using its binary offset in the file. Because of this, any change to the data file while RODB is running can move those offsets, thus corrupting the indexes. To avoid returning corrupted data, the default behaviour of RODB is to stop the service with an error whenever such a change happens. While not recommended, setting this property to false prevents RODB from stopping when the data source changes.
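For completeness, a minimal sketch of an input that opts out of this safety stop is shown below. The surrounding settings are reused from the earlier examples, and the comment only restates the risk described above.

```yaml
name: cities
type: csv
path: ./cities.csv
autodetectColumns: true
ignoreFirstRow: true
# Not recommended: RODB keeps running if ./cities.csv changes,
# but record offsets may have moved and the indexes may be corrupted.
dieOnInputChange: false
```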

JSON

inputs[type = “json”]

Type: Object

A JSON input reads data from a file containing one JSON document per row. Each JSON document translates into one record and must be an object.

Files containing a single JSON array of records are not supported.
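The expected file layout is sketched in the comments below. The file content is hypothetical and only illustrates the one-object-per-row requirement.

```yaml
# Hypothetical content of ./countries.json (one JSON object per row):
#   {"name": "France", "continent": "Europe"}
#   {"name": "Japan", "continent": "Asia"}
#
# Not supported (a single array holding all the records):
#   [{"name": "France"}, {"name": "Japan"}]
name: countries
type: json
path: ./countries.json
```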

Example:

name: countries
type: json
path: ./countries.json

Properties:

name

inputs[type = “json”].name

Type: String

The name of this input, which any other component will use to refer to it.

type

inputs[type = “json”].type

Must have the value: "json"

path

inputs[type = “json”].path

Type: String

The relative or absolute path to the data file on this filesystem.

dieOnInputChange (optional)

inputs[type = “json”].dieOnInputChange

Type: Boolean

Default value: true

XML

inputs[type = “xml”]

Type: Object

An XML input reads data from an XML file. Each record, and each value inside a record, is located using an XPath expression. This input provides several settings, described below.

Example:

name: users
type: xml
path: ./users.xml
recordXpath: "//User"
properties:
  - name: id
    parser: integer
    xpath: "/Id"
  - name: name
    parser: string
    xpath: "string(//Name)"
  - name: roleIds
    type: array
    xpath: "number(/Roles/Role/@Id)"
    items:
      parser: integer
  - name: manager
    type: object
    xpath: "/Manager"
    properties:
      - name: id
        parser: integer
        xpath: "/@Id"
      - name: name
        parser: string
        xpath: "/Name"

Properties:

name

inputs[type = “xml”].name

Type: String

The name of this input, which any other component will use to refer to it.

type

inputs[type = “xml”].type

Must have the value: "xml"

path

inputs[type = “xml”].path

Type: String

The relative or absolute path to the data file on this filesystem.

recordXpath

inputs[type = “xml”].recordXpath

Type: String

An XPath expression that returns the collection of XML nodes to use as records. Each returned node becomes one record in the resulting data.

The XPath implementation available in RODB is based on the antchfx/xpath Golang package; the supported syntax is described in that package's documentation.

properties

inputs[type = “xml”].properties

Type: Array of Object. Empty array not allowed.

Each item in this array describes a different property in the resulting object.

Items of the array

inputs[type = “xml”].properties[]

Type: Object

Properties:

name

inputs[type = “xml”].properties[].name

Type: String

The name of this property. This is the name that the other components of RODB use to refer to it. It must be unique within this object.

There are no restrictions on the content of this name. Any Unicode string is valid.

type (optional)

inputs[type = “xml”].properties[].type

Type: String

Defines the data type of this property: primitive (the default), object, or array. The xpath, parser, items, and properties parameters describe how each type is used.

xpath

inputs[type = “xml”].properties[].xpath

Type: String

An XPath expression that returns the value to assign to this property. The root node of the XPath expression is the node returned by the parent’s XPath expression.

If the returned value is a node, the type must be object. If the returned value is a collection of nodes, the type must be array. Otherwise, the type must be primitive (which is the default value).

The XPath implementation available in RODB is based on the antchfx/xpath Golang package; the supported syntax is described in that package's documentation.

parser (optional)

inputs[type = “xml”].properties[].parser

Type: String

Default value: "string"

This parameter is only valid when the type parameter is set to primitive. The value returned by the XPath expression must be either a string (in which case the value will be processed by the given parser) or of the same type as the parser.

items (optional)

inputs[type = “xml”].properties[].items

Type: Object

This parameter is only valid when the type parameter is set to array. The definition of this object is the same as that of the items of the properties array currently being described. Each node in the collection will be processed and parsed individually using the given definition.

properties (optional)

inputs[type = “xml”].properties[].properties

Type: Array

This parameter is only valid when the type parameter is set to object. The definition of this array is the same as that of the properties array currently being described.

dieOnInputChange (optional)

inputs[type = “xml”].dieOnInputChange

Type: Boolean

Default value: true