
importData

This method creates a database table with the defined columns, then imports values into it from a CSV file.

CSV columns are mapped to table columns, and a range of filters and formatters can be applied to rows and values as they are imported. It is also possible to create table columns that are not present in the CSV file.

Naming

All table names are automatically prefixed di_ (data import). However, if you provide a table name that already starts with di_, the prefixing is skipped: you won't end up with di_di_tablename.
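For example (an illustrative fragment of the params object, not a complete request), both of these values result in a table named di_cars:

"tablename": "cars"

"tablename": "di_cars"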

Import Version Numbers

When you first perform an import, the version number is recorded in the table's di_version column as version 1. Subsequent imports into the same table create additional rows with the version number incremented by 1. The purgeImportData method can be used to tidy up old versions.

Appending Data

It is possible to append data to an existing import without increasing the version number. The existing data must have been imported with allowAppend: true. Subsequent imports should set the appendTo parameter, which is the di_version of the existing data that the import should be added to.
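As an illustrative sketch (most parameters, including the required file and column definitions, are omitted for brevity, and the version number 3 is just an example), the first request below creates an appendable import:

{
    "method": "importData",
    "params": {
        "tablename": "di_cars",
        "allowAppend": true
    },
    "jsonrpc": "2.0"
}

Assuming that import was recorded as di_version 3, this second request appends to it:

{
    "method": "importData",
    "params": {
        "tablename": "di_cars",
        "appendTo": 3
    },
    "jsonrpc": "2.0"
}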

Metadata

Two tables hold metadata for the imports:

  • di_meta_tables_v1 records the names and latest version of each data import table
  • di_meta_versions_v1 records the table name, version, label, import start time and import complete time of each data import. The version number increases each time (non-appended) data is imported into a table. The importcompleted column will be NULL if allowAppend: true was set when the import was performed

Parameters

  • tablename (String): Required. The name of the table to create. All table names will be automatically prefixed di_
  • file (String): One of file or filestoreid must be provided. The path to a local file (local to the server)
  • filestoreid (String): One of file or filestoreid must be provided. The fileID of a file held in the platform's filestore
  • filecolumns (Array of objects): Required. Each object represents a column in the CSV file and holds the column name, index, the tablecolumn mapping etc, see below for more detail
  • tablecolumns (Array of objects): Required. The columns to create. Each object includes the column's name, type, size etc, see below for more detail
  • datasource (String): Optional. If not provided, the datasource will be set to the iCM datasource name appended with _custom, eg icm_custom. The importer is designed to be used with its own datasource and the name of that datasource should be provided with each request; speak to your account manager about getting a datasource set up
  • label (String): Optional. A label to identify the import in the di_meta tables. If not set the file or filestoreid will be used
  • columnHeader (Boolean): Optional, default: true. Whether or not the CSV has column headers. If it doesn't have headers, set the col property in the filecolumns instead, see below
  • rowFilter (String): Optional. A JavaScript expression used to filter the CSV rows as they are imported, see below
  • delimiter (String): Optional, default: ","
  • batchsize (Integer): Optional, default: 50. Set a value between 50 and 1000
  • appendTo (Integer): Optional. An existing import version number to append this data to. If not set, the data will be imported with a new version number. The existing data must have been imported with allowAppend: true
  • allowAppend (Boolean): Optional, default: false. Set true to allow future imports to be appended to this import version
  • sqlCopy (Boolean): Optional, default: false. Whether or not to use the PostgreSQL COPY FROM command when performing the import. The file must come from the filestore, or be local to the database server. You may like to experiment with this setting and compare import performance
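As a starting point, this is a minimal sketch of a request using only the required parameters plus a datasource (the id, names and file path are placeholders):

{
    "id": 1,
    "method": "importData",
    "params": {
        "datasource": "my_datasource",
        "tablename": "example",
        "file": "/tmp/example.csv",
        "filecolumns": [
            {"name": "ID", "tablecolumn": "ID"}
        ],
        "tablecolumns": [
            {"name": "ID", "type": "INTEGER", "primaryKey": true}
        ]
    },
    "jsonrpc": "2.0"
}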

File Columns

The filecolumns parameter is an array of objects where each object represents a column in the CSV file.

For example, this represents some data held in a CSV about cars:

"filecolumns":[
    {"name":"ID","tablecolumn":"ID"},
    {"name":"MODEL","tablecolumn":"Model","nullvalue":"","formatters":"trim"},
    {"name":"COLOUR","tablecolumn":"Colour","nullvalue":"","formatters":"trim,lower"}
]

Each object has the following properties.

  • name (String): Required. The name of the column that's being imported, if the CSV file has a header row and columnHeader: true. If the CSV doesn't have a header row this name is still used in the rowFilter. It's possible to read a column from the CSV multiple times and insert the values into different table columns, perhaps with different formatters (see the sketch after this list). You can also set the name as "NULL" to create table columns that are not present in the CSV file, useful if you want to insert empty columns or, for example, the current date using the "now" formatter
  • col (Integer): Optional. If the CSV file doesn't have a header row, this is the index of the column to read data from. If you set both a name and a col, the col is used
  • tablecolumn (String): Required. The table column to import into. Must be defined in the tablecolumns array, see below
  • nullvalue (String): Optional. If this value is found in the data it will be inserted as null
  • formatters (String): Optional. One or more named formatters (comma separated) to apply to the data
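For example, this illustrative fragment reads the CSV's NAME column twice, inserting a raw copy and a formatted copy into different table columns, and uses a "NULL" name to populate an ImportedWhen column that doesn't exist in the CSV:

"filecolumns": [
    {"name": "NAME", "tablecolumn": "NameRaw"},
    {"name": "NAME", "tablecolumn": "NameClean", "formatters": "trim,lower"},
    {"name": "NULL", "tablecolumn": "ImportedWhen", "formatters": "now"}
]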

Formatters

The following formatters can be applied to values as data is imported.

  • now: Insert the current date and time as "yyyy-MM-dd HH:mm:ss"
  • lower: Insert the value in lowercase
  • upper: Insert the value in uppercase
  • trim: Insert the value with whitespace trimmed from both ends
  • ltrim: Insert the value with whitespace trimmed from the start
  • rtrim: Insert the value with whitespace trimmed from the end
  • capitalise: Insert the value with the first letter capitalised (other letters are first lowercased)
  • quote: Insert the value wrapped in double quotes
  • singlequote: Insert the value wrapped in single quotes
  • nospace: Insert the value with spaces removed
  • round: Insert the value rounded to the nearest integer
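For example, assuming this illustrative filecolumn mapping, a CSV value of "  Deep Red " would be inserted as "deep red":

{"name": "COLOUR", "tablecolumn": "Colour", "formatters": "trim,lower"}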

Row Filters

Row filters are JavaScript expressions. Only rows in the CSV file that match the filter will be imported.

The expression is provided with variables derived from the name field of each filecolumn.

Note that any non-alphanumeric characters in column names are replaced with underscores before being passed to the filter engine. Be careful with case sensitivity: for a name of MAG the filter could use MAG, Mag or mag, but not mAg or maG.

For example, this filter only imports rows where the value in the "Mag" column is less than 5 and the value in the "Distance" column is less than 200, or where the value in the "Mag" column is greater than 30:

"rowFilter": "(Mag < 5 && Distance < 200) || Mag > 30"

This filter only imports rows that have "Yes" in the "In use?" column:

"rowFilter": "IN_USE_ == 'Yes'"

Table Columns

The tablecolumns parameter is an array of objects where each object represents a column that will be created in the new database table. Note that a di_version column is automatically created in every table as described above.

  • name (String): Required. The name of the column
  • type (String): Required. The data type
  • size (Integer): Required if type is VARCHAR
  • primaryKey (Boolean): Optional. Whether or not this column is the primary key
  • nullable (Boolean): Optional. Whether or not the column allows null values
  • index (String): Optional. If set, the column will be added to an index of this name (see the sketch after this list)
  • indexorder (String): Optional. The position of the column in the index, eg "1"
  • include (String): Optional. The names of other table columns to include as non-key columns in the index
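For example, this illustrative fragment builds an index named "location" on Easting (position "1") and Northing (position "2"), with Postcode included as a non-key column (here include is set on the first indexed column):

"tablecolumns": [
    {"name": "Postcode", "type": "VARCHAR", "size": 25, "primaryKey": true},
    {"name": "Easting", "type": "INTEGER", "index": "location", "indexorder": "1", "include": "Postcode"},
    {"name": "Northing", "type": "INTEGER", "index": "location", "indexorder": "2"}
]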

Examples

This example imports data from four named columns in the CSV file into four columns in the table. The data will be imported as a new version and, because allowAppend is not set, future imports can't be appended to it.

{
    "id": 1234567,
    "method": "importData",
    "params": {
        "datasource": "my_datasource",
        "tablename": "di_cars",
        "filestoreid": "fe7bfe91-a4f1-446f-8678-0dc8657f0a64",
        "filecolumns": [
            {"name": "ID", "tablecolumn":"ID"},
            {"name": "MANU", "tablecolumn": "Manu", "nullvalue": "", "formatters": "trim"},
            {"name" :"MODEL", "tablecolumn": "Model", "nullvalue": "", "formatters": "trim"},
            {"name":"COLOUR", "tablecolumn": "Colour", "nullvalue": "", "formatters": "trim"}
        ],
        "tablecolumns": [
            {"name": "ID", "type": "INTEGER", "primaryKey": true},
            {"name": "Manu", "type": "VARCHAR", "size": 60},
            {"name": "Model", "type": "VARCHAR", "size": 25},
            {"name": "Colour", "type": "VARCHAR", "size": 25}
        ],
        "label": "Cars"
    },
    "jsonrpc": "2.0"
}

This example imports postcode data. Only rows that are "in use" will be imported. The import uses column indexes as there are no column headers. Any empty strings in the CSV will be inserted as NULL.

{
    "id": 1234567,
    "method": "importData",
    "params": {
        "datasource": "my_datasource",
        "tablename": "postcodes",
        "filestoreid": "9b181ce4-e3f8-4c9f-913f-d652b176fe89",
        "columnHeader": false,
        "filecolumns": [
            {"name": "Postcode", "col" :1, "tablecolumn": "Postcode"},
            {"name": "Inuse", "col": 2, "tablecolumn": "Inuse"},
            {"name": "Easting", "col": 5, "tablecolumn": "Easting", "nullvalue": ""},
            {"name": "Northing", "col": 6, "tablecolumn": "Northing", "nullvalue": ""},
            {"name": "Latitude", "col": 3, "tablecolumn": "Latitude", "nullvalue": ""},
            {"name": "Longitude", "col": 4, "tablecolumn": "Longitude", "nullvalue": ""}
        ],
        "tablecolumns":[
            {"name": "Postcode", "type": "VARCHAR", "size": 25, "primaryKey": true},
            {"name": "InUse", "type": "VARCHAR", "size": 5},
            {"name": "Easting", "type": "INTEGER", "nullable": true},
            {"name": "Northing", "type": "INTEGER", "nullable": true},
            {"name": "Latitude", "type": "FLOAT", "nullable": true},
            {"name": "Longitude", "type": "FLOAT", "nullable": true}
        ],
        "rowFilter": "INUSE == 'Yes'",
        "batchsize": 500,
        "label": "PL_postcodes"
    },
    "jsonrpc": "2.0"
}

This final example appends the imported data to an existing version and creates columns in the table which don't exist in the CSV file (the three NULL columns), one of which inserts the date and time of the import.

{
    "id": 1234567,
    "method": "importData",
    "params": {
        "datasource": "my_datasource",
        "tablename":"stars10",
        "filestoreid":"4f0bba11-6a8c-4cd9-8381-1e46f4aedb92",
        "columnHeader":true,
        "filecolumns":[
            {"name": "StarID", "tablecolumn": "ID"},
            {"name": "PROPERNAME", "tablecolumn": "StarName", "nullvalue": "-"},
            {"name": "RA", "tablecolumn":" RightAsc"},
            {"name": "DEC", "tablecolumn": "Declination"},
            {"name": "DISTANCE", "tablecolumn": "Distance"},
            {"name": "MAG", "tablecolumn": "Magnitude"},
            {"name": "SPECTRUM", "tablecolumn": "Spectrum1", "formatters": "trim,lower"},
            {"name": "SPECTRUM", "tablecolumn": "Spectrum2", "formatters": "lower", "nullvalue": "-"},
            {"name": "NULL", "tablecolumn": "ImportedLocE"},
            {"name": "NULL", "tablecolumn": "ImportedLocN"},
            {"name": "NULL", "tablecolumn": "ImportedWhen", "formatters": "now"}
        ],
        "tablecolumns":[
            {"name": "ID", "type": "INTEGER", "primaryKey": true},
            {"name": "StarName", "type": "VARCHAR", "size": 60, "nullable": true},
            {"name": "RightAsc", "type": "FLOAT", "index": "radec"},
            {"name": "Declination", "type": "FLOAT", "index": "radec"},
            {"name": "Distance", "type": "FLOAT"},
            {"name": "Magnitude", "type": "FLOAT"},
            {"name": "Spectrum1", "type": "VARCHAR", "size": 30},
            {"name": "Spectrum2", "type": "VARCHAR", "size": 30, "nullable": true},
            {"name": "ImportedLocE", "type": "INTEGER", "nullable": true},
            {"name": "ImportedLocN", "type": "INTEGER", "nullable": true},
            {"name": "ImportedWhen", "type": "DATETIME"}],
        "rowFilter": "(Mag < 5 && Distance < 200) || Mag > 30",
        "batchsize": 20,
        "appendTo": 6,
        "allowAppend": true
    },
    "jsonrpc": "2.0"
}

Last modified on September 06, 2022
