Configuration - Inputs
Type: Array of Object. Each item in this array must match one of the following definitions.
An input defines the data source and format to be used in RODB. It also defines how to translate each record in the data source to an internal structure to be used in RODB.
Example:
inputs:
  - name: cities
    type: csv
    path: ./cities.csv
    ignoreFirstRow: true
    columns:
      - name: name
        parser: string
      - name: prefecture
        parser: string
      - name: population
        parser: integer
  - name: countries
    type: json
    path: ./countries.json
CSV
Type: Object
A CSV input reads data from a CSV file. Each CSV row translates into one record. This input provides several settings described below.
Examples:
name: cities
type: csv
path: ./cities.csv
ignoreFirstRow: true
columns:
  - name: name
    parser: string
  - name: prefecture
    parser: string
  - name: population
    parser: integer

name: countries
type: csv
path: ./cities.csv
ignoreFirstRow: true
autodetectColumns: true
delimiter: ","
Properties:
name
Type: String
The name of this input, which any other component will use to refer to it.
type
Must have the value: "csv"
path
Type: String
The relative or absolute path to the data file on this filesystem.
RODB only reads the input files from the filesystem. Other data sources (NFS, SSH, S3…) should be handled by mounting them in the filesystem using external tools (Docker or Kubernetes volumes, sshfs, s3fs…).
delimiter (optional)
Type: String
Default value: ","
The character used by the CSV file to separate each column inside a record.
autodetectColumns (optional)
Type: Boolean
Whether or not RODB should attempt to automatically detect the column list using the first row of the file.
When this parameter is true, the columns parameter must not be defined.
The property names in the parsed record are strictly identical to the content of the first row of the CSV. No transformation is applied: case, spaces and special characters are all preserved in the final property names.
When this parameter is used, all columns internally get the string type, using the string parser. No type detection or casting is attempted.
Enabling this setting does not prevent the first row from being included in the data. To exclude it, please refer to the ignoreFirstRow parameter.
ignoreFirstRow (optional)
Type: Boolean
Whether or not the first row of the CSV file should be ignored.
When true, the first row is not included in the data.
When false, the first row is included in the data.
Even if autodetectColumns is enabled, the first row will still be included in the data unless ignoreFirstRow is set to true.
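The interaction between autodetectColumns and ignoreFirstRow can be illustrated with a minimal Python sketch. It is not RODB's implementation: the function name read_autodetected and the sample data are hypothetical, and it only mimics the behaviour described above (header cells used verbatim as property names, every value kept as a string, first row dropped only when ignoreFirstRow is true).

```python
import csv
import io


def read_autodetected(text, ignore_first_row=False, delimiter=","):
    """Hypothetical helper mimicking autodetectColumns: true."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    # Header cells become property names verbatim (case and spaces kept).
    names = rows[0]
    # The first row is removed from the data only when ignoreFirstRow is true.
    data_rows = rows[1:] if ignore_first_row else rows
    # Values stay strings: no type detection or casting is attempted.
    return [dict(zip(names, row)) for row in data_rows]


# Made-up sample data, for illustration only.
sample = "City Name,Population\nAlpha,100\n"
kept = read_autodetected(sample)
skipped = read_autodetected(sample, ignore_first_row=True)
```

In this sketch, kept contains two records, the first of which is the header row mapped onto itself, while skipped contains only the data row. This is why both settings are usually enabled together.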
columns (optional)
Type: Array of Object. Empty array not allowed.
This parameter is required, unless autodetectColumns is true (in which case it must not be set).
If fewer columns are defined than appear in the data, the remaining columns of the data are ignored and not included in the record.
If more columns are defined than appear in the data, the trailing defined columns are assigned the null value.
Items of the array
Type: Object
Properties:
name
Type: String
The name of this column. It is the name that the other components of RODB use to refer to this property. It must be unique within this input.
There are no restrictions on the content of this name. Any Unicode string is valid.
parser (optional)
Type: String
Default value: "string"
The name of the parser to apply to this column’s value.
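The column-count rules given for the columns parameter can be sketched in Python as follows. This is an illustration under stated assumptions, not RODB's code: row_to_record is a hypothetical helper, the sample rows are made up, and parsers are left out so every value stays a string.

```python
def row_to_record(column_names, row):
    """Hypothetical helper: map one CSV row onto the defined columns."""
    record = {}
    for index, name in enumerate(column_names):
        # A defined column with no matching cell receives None (null).
        record[name] = row[index] if index < len(row) else None
    # Cells beyond the defined columns are never read, so they are dropped.
    return record


columns = ["name", "prefecture", "population"]
# Row with one more cell than there are defined columns.
extra = row_to_record(columns, ["Alpha", "North", "100", "unused"])
# Row with one cell fewer than there are defined columns.
short = row_to_record(columns, ["Beta", "South"])
```

Here extra has no trace of the fourth cell, and short["population"] is None.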
dieOnInputChange (optional)
Type: Boolean
Default value: true
RODB internally identifies each record using its binary offset in the file.
Because of this, any change in the data file while RODB is running can move those offsets, thus corrupting the indexes.
To avoid returning corrupted data, the default behaviour of RODB is to stop the service with an error whenever the data file changes.
While not recommended, setting this property to false prevents RODB from stopping when the data source changes.
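To see why a changed file corrupts offset-based indexes, consider the following Python sketch. It is only an illustration of the general idea and makes no claim about RODB's index format: record_offsets and read_at are hypothetical helpers, and the data is made up.

```python
def record_offsets(data: bytes):
    """Return the byte offset at which each line (record) starts."""
    offsets = []
    position = 0
    for line in data.splitlines(keepends=True):
        offsets.append(position)
        position += len(line)
    return offsets


def read_at(data: bytes, offset: int) -> bytes:
    """Read from the given offset up to the next newline (or end of data)."""
    end = data.find(b"\n", offset)
    return data[offset:] if end == -1 else data[offset:end]


original = b"Alpha,100\nBeta,200\nGamma,300\n"
index = record_offsets(original)  # offsets computed once, at startup

# The first record grows by five bytes while the index stays unchanged.
modified = b"Alphaville,100\nBeta,200\nGamma,300\n"
fresh = read_at(original, index[1])
stale = read_at(modified, index[1])
```

With the original data, the second offset points at the start of the second record. After the edit, the same offset lands in the middle of the first record, so the lookup returns a fragment instead of a valid row.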
JSON
Type: Object
A JSON input reads data from a file containing one JSON document per row. Each JSON document translates into one record and must be an object.
Files containing a single JSON array of records are not supported.
Example:
name: countries
type: json
path: ./countries.json
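The expected file layout (one JSON object per row, no enclosing array) can be sketched in Python as follows. This is a minimal illustration, not RODB's reader: read_json_lines is a hypothetical helper, the sample data is made up, and skipping blank lines is an assumption of this sketch rather than documented behaviour.

```python
import json


def read_json_lines(text):
    """Hypothetical helper: parse one JSON object per line."""
    records = []
    for line_number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # assumption of this sketch: blank lines are skipped
        document = json.loads(line)
        # Each document must be an object; arrays and scalars are rejected.
        if not isinstance(document, dict):
            raise ValueError(f"line {line_number}: expected a JSON object")
        records.append(document)
    return records


# Made-up sample data: two rows, one object per row.
sample = '{"name": "Alpha", "code": "AA"}\n{"name": "Beta", "code": "BB"}\n'
records = read_json_lines(sample)
```

A file whose only content is a single JSON array, such as [{"name": "Alpha"}], raises an error in this sketch, matching the restriction stated above.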
Properties:
name
Type: String
The name of this input, which any other component will use to refer to it.
type
Must have the value: "json"
path
Type: String
The relative or absolute path to the data file on this filesystem.
RODB only reads the input files from the filesystem. Other data sources (NFS, SSH, S3…) should be handled by mounting them in the filesystem using external tools (Docker or Kubernetes volumes, sshfs, s3fs…).
dieOnInputChange (optional)
Type: Boolean
Default value: true
RODB internally identifies each record using its binary offset in the file.
Because of this, any change in the data file while RODB is running can move those offsets, thus corrupting the indexes.
To avoid returning corrupted data, the default behaviour of RODB is to stop the service with an error whenever the data file changes.
While not recommended, setting this property to false prevents RODB from stopping when the data source changes.
XML
Type: Object
An XML input reads data from an XML file. Each record, and each value inside a record, is located using an XPath expression. This input provides several settings described below.
Example:
name: users
type: xml
path: ./users.xml
recordXpath: "//User"
properties:
  - name: id
    parser: integer
    xpath: "/Id"
  - name: name
    parser: string
    xpath: "string(//Name)"
  - name: roleIds
    type: array
    xpath: "number(/Roles/Role/@Id)"
    items:
      parser: integer
  - name: manager
    type: object
    xpath: "/Manager"
    properties:
      - name: id
        parser: integer
        xpath: "/@Id"
      - name: name
        parser: string
        xpath: "/Name"
Properties:
name
Type: String
The name of this input, which any other component will use to refer to it.
type
Must have the value: "xml"
path
Type: String
The relative or absolute path to the data file on this filesystem.
RODB only reads the input files from the filesystem. Other data sources (NFS, SSH, S3…) should be handled by mounting them in the filesystem using external tools (Docker or Kubernetes volumes, sshfs, s3fs…).
recordXpath
Type: String
An XPath expression that returns the collection of XML nodes, each of which becomes a record in the resulting data.
The XPath implementation available in RODB is based on the antchfx/xpath Golang package, whose documentation describes the supported syntax.
properties
Type: Array of Object. Empty array not allowed.
Each item in this array describes a different property in the resulting object.
Items of the array
Type: Object
Properties:
name
Type: String
The name of this property. It is the name that the other components of RODB use to refer to it. It must be unique within this object.
There are no restrictions on the content of this name. Any Unicode string is valid.
type (optional)
Type: String
Defines the data type of this property: primitive (the default), object or array. The other parameters describe when each value applies.
xpath
Type: String
An XPath expression that returns the value to assign to this property. The root node of the XPath expression is the node returned by the parent’s XPath expression.
If the returned value is a node, the type must be object.
If the returned value is a collection of nodes, the type must be array.
Otherwise, the type must be primitive (which is the default value).
The XPath implementation available in RODB is based on the antchfx/xpath Golang package, whose documentation describes the supported syntax.
parser (optional)
Type: String
Default value: "string"
This parameter is only valid when the type parameter is set to primitive.
The value returned by the XPath expression must be either a string (in which case the value will be processed by the given parser) or a value of the type that the parser produces.
items (optional)
Type: Object
This parameter is only valid when the type parameter is set to array.
The definition of this object is the same as that of the items of the properties array currently being described.
Each node in the collection will be processed and parsed individually using the given definition.
properties (optional)
Type: Array
This parameter is only valid when the type parameter is set to object.
The definition of this array is the same as that of the properties array currently being described.
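The way primitive, array and object properties nest, each one evaluated relative to the node selected by its parent, can be sketched in Python. This is a rough illustration only. It uses the limited path syntax of Python's xml.etree.ElementTree instead of antchfx/xpath, so the hypothetical "path" key stands in for xpath and the expressions differ from the ones RODB accepts. The PARSERS table, the helper names and the sample document are all made up, array items are assumed to be primitive or object, and missing nodes are not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the string and integer parsers.
PARSERS = {"string": str, "integer": int}

SAMPLE = """<Users>
  <User>
    <Id>1</Id>
    <Name>Ann</Name>
    <Roles><Role>10</Role><Role>20</Role></Roles>
    <Manager><Id>7</Id><Name>Bob</Name></Manager>
  </User>
</Users>"""

PROPERTIES = [
    {"name": "id", "parser": "integer", "path": "Id"},
    {"name": "name", "path": "Name"},  # parser defaults to string
    {"name": "roleIds", "type": "array", "path": "Roles/Role",
     "items": {"parser": "integer"}},
    {"name": "manager", "type": "object", "path": "Manager",
     "properties": [
         {"name": "id", "parser": "integer", "path": "Id"},
         {"name": "name", "path": "Name"},
     ]},
]


def convert(node, definition):
    """Convert an already selected node according to its definition."""
    if definition.get("type", "primitive") == "object":
        # Child paths are evaluated relative to this node.
        return {p["name"]: select(node, p) for p in definition["properties"]}
    parser = PARSERS[definition.get("parser", "string")]
    return parser(node.text)


def select(parent, definition):
    """Locate the node(s) for one property under parent, then convert."""
    if definition.get("type", "primitive") == "array":
        # Each node of the collection is converted with the items definition.
        nodes = parent.findall(definition["path"])
        return [convert(node, definition["items"]) for node in nodes]
    return convert(parent.find(definition["path"]), definition)


root = ET.fromstring(SAMPLE)
record_definition = {"type": "object", "properties": PROPERTIES}
records = [convert(user, record_definition) for user in root.findall(".//User")]
```

In this sketch, the manager's Id path resolves inside the Manager node rather than the User node, which mirrors the rule that the root of each expression is the node returned by the parent's expression.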
dieOnInputChange (optional)
Type: Boolean
Default value: true
RODB internally identifies each record using its binary offset in the file.
Because of this, any change in the data file while RODB is running can move those offsets, thus corrupting the indexes.
To avoid returning corrupted data, the default behaviour of RODB is to stop the service with an error whenever the data file changes.
While not recommended, setting this property to false prevents RODB from stopping when the data source changes.