eggbird
software development and
knowledge engineering

ClaM - A Classification Manager

Importing a classification

The import function is meant for importing well-defined, structured text files, which must be encoded as UTF-8 before import. The present version of ClaM supports the following formats: the latest CEN EN 14463:2007 (ClaML) format and 'delimited text format'. The latter has to be passed through a converter, which transforms it into the correct internal format.
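If a source file is stored in another encoding, it can be re-encoded before import. The following is a minimal Python sketch, assuming the source is in Windows-1252 (cp1252); the function name is hypothetical and not part of ClaM:

```python
def reencode_to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode bytes from the assumed source encoding and re-encode them as UTF-8."""
    text = raw.decode(source_encoding)  # raises UnicodeDecodeError on invalid bytes
    return text.encode("utf-8")


# The byte 0xE9 is 'é' in cp1252; in UTF-8 it becomes the two bytes 0xC3 0xA9.
converted = reencode_to_utf8(b"caf\xe9")
```

To convert a whole file, read it in binary mode, pass the bytes through the function and write the result to a new file in binary mode.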

  1. Select Import from the File menu.
  2. Select the file you want to import.
  3. Press the Open button.

NOTE: If you want to import a series of files (for example chapters), give them a base name plus appropriate consecutive numbers or letters, and use the wildcard characters * and ? to specify the import file name (see below).
For example, "F:\sources\classifications\icd10-nl-2010-ch*.xml" lets you import chapters 1-22 of the ICD-10.
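To check in advance which files a pattern would select, the wildcards can be tried out in Python. This sketch assumes that ClaM gives * and ? their usual shell-style meaning; it uses Python's fnmatch module, not ClaM's own matcher, and the file names are hypothetical:

```python
from fnmatch import fnmatchcase

# Hypothetical chapter file names; only the pattern comes from the example above.
names = [f"icd10-nl-2010-ch{n:02d}.xml" for n in range(1, 23)]
names += ["icd10-nl-2010-index.xml", "icd10-nl-2010-ch01.txt"]

pattern = "icd10-nl-2010-ch*.xml"   # * matches any run of characters
matched = [name for name in names if fnmatchcase(name, pattern)]

single = "icd10-nl-2010-ch0?.xml"   # ? matches exactly one character
first_nine = [name for name in names if fnmatchcase(name, single)]
```

Here matched holds the 22 chapter files, and first_nine holds only the chapters numbered 01 to 09.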

Conversion

If your classification is not yet available in ClaML, you can use this Import function to read it into ClaM. It reads text files and has many options for tuning it to the way your classification is represented in the text. There are two basic input formats: the two-column multi-line form and the single-line multi-column form.

In the two-column form, the source file can be parsed into ClaM in a single run. In the Code specification, fill in Field 1 and leave all the others blank or 0. If the code is itself hierarchical, you can select that option, which instructs ClaM to build the hierarchical relations at import. In the box 'Preferred (main) rubric', enter 'preferred' as kind and select Field number 2 for the value. For the lines that follow the line containing the code, 'preferred' is replaced by the mapping defined in the box 'Other rubrics..'. This mapping takes care of translating a source label such as Innefattar into inclusion in the internal representation.

In the single-line multi-column input form, you can import only one code-rubric combination at a time, so several runs are necessary to build the classification. Codes and terms can both be spread over multiple fields (columns of your table) in the source. In the Converter you can select up to three fields to construct the code, and likewise up to three for the term. If only one field is needed, leave the others blank or 0. You can separate the imported fields with a space by selecting the checkboxes between the three field number selectors. First specify which field(s) of your table contain the code. Next specify the field number(s) that contain the preferred rubrics. When everything is specified, press OK for the first run. Repeat this process until all the information you need has been imported. The Code specification can remain the same in subsequent runs. NOTE: Take care to fill the box 'Preferred (main) Rubric' with the appropriate rubric kind (inclusion, exclusion) and the corresponding Field (column) number.
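The way a code or term is assembled from up to three fields can be sketched in Python. This only illustrates the behaviour described above and is not ClaM's implementation; the function name, the argument layout and the sample row are hypothetical:

```python
def build_value(row, field_numbers, add_space=(False, False)):
    """Combine up to three 1-based fields of a row; 0 or None means 'not used'."""
    parts = []
    for position, number in enumerate(field_numbers):
        if not number:
            continue  # selector left blank or 0
        # add_space[i] mirrors the checkbox between selector i and selector i+1
        if parts and add_space[position - 1]:
            parts.append(" ")
        parts.append(row[number - 1])
    return "".join(parts)


# Hypothetical table row with four columns.
row = ["A", "01", "first part", "second part"]
code = build_value(row, (1, 2, 0))                           # fields 1 and 2, no space
term = build_value(row, (3, 4, 0), add_space=(True, False))  # fields 3 and 4, with space
```

With this row, code is "A01" and term is "first part second part".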

Converter

How to use Converter

The Converter can read rubrics of at most 400 characters. During conversion, lines containing rubrics that are too long are reported. You can split such a rubric over several lines: the first of these lines should contain the rubric kind in the code field, and the remaining lines should contain the character '+' instead of the rubric kind.
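Splitting an overlong rubric can be automated when preparing the source file. The sketch below is one possible approach in Python, assuming a semicolon as field delimiter and breaking at word boundaries; the function name is hypothetical. How ClaM joins the continuation lines (with or without a space) is not stated here, so verify the imported result:

```python
import textwrap


def split_rubric(kind, text, limit=400, delimiter=";"):
    """Return source lines for one rubric, each part at most `limit` characters long."""
    # textwrap.wrap breaks at whitespace and drops the whitespace at each break
    parts = textwrap.wrap(text, width=limit) or [""]
    lines = [f"{kind}{delimiter}{parts[0]}"]                # first line carries the rubric kind
    lines += [f"+{delimiter}{part}" for part in parts[1:]]  # continuation lines carry '+'
    return lines


# A 999-character rubric made of 200 short words, for illustration only.
long_text = " ".join(["word"] * 200)
lines = split_rubric("excl", long_text)
```

For this input the result is three lines: one starting with excl; and two starting with +;.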

Import of META tags

As of ClaM 7.30.03, the import of META tags at a class is supported. Select "Import meta information instead of rubric" on the input form. The import works in the same way as for rubrics. Please bear in mind that a META tag can have only one value at any one class.

Example Two column multi-line input

Suppose you want to convert the following file Example.txt:

A01;the rubric for A01
incl;an inclusion rubric for A01
incl;and one more inclusion rubric for A01
excl;here is an exclusion rubric for A01
+;spread across three
+;lines
A02;the rubric for A02
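The structure of this file can be made explicit with a short Python sketch that groups the lines per code. It only illustrates how the two-column multi-line form is read and is not ClaM's converter; the mapping of incl and excl, the function name and the joining of '+' lines with a space are assumptions:

```python
KIND_MAP = {"incl": "inclusion", "excl": "exclusion"}  # assumed 'Other rubrics..' mapping

EXAMPLE = """A01;the rubric for A01
incl;an inclusion rubric for A01
incl;and one more inclusion rubric for A01
excl;here is an exclusion rubric for A01
+;spread across three
+;lines
A02;the rubric for A02"""


def parse_two_column(lines, delimiter=";"):
    """Group two-column lines into {code: [[kind, text], ...]}."""
    classes = {}
    current = None  # rubric list of the class being read
    for line in lines:
        if not line.strip():
            continue
        first, _, text = line.partition(delimiter)
        if first == "+":
            # continuation line: extend the previous rubric (space-joined, an assumption)
            current[-1][1] += " " + text
        elif first in KIND_MAP:
            current.append([KIND_MAP[first], text])
        else:
            # any other value in field 1 is taken as a new code with its preferred rubric
            current = classes.setdefault(first, [])
            current.append(["preferred", text])
    return classes


classes = parse_two_column(EXAMPLE.splitlines())
```

In the result, A01 has one preferred rubric, two inclusions and a single exclusion whose text combines the three source lines; A02 has only its preferred rubric.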

Miscellaneous

Import of XML and HTML tags

By default, some characters such as < and > are translated into a so-called web-safe representation. This would make it impossible to import HTML and XML tags in otherwise plain text. As of version 7.30.03, ClaM supports the import of these tags: just place a "\" before the < or >. For example:

arthritis+ (\<Reference\>M01.3*\</Reference\>)
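When many rubrics contain tags, the backslashes can be added while preparing the source file. A minimal Python sketch, assuming only simple tags without attributes need escaping; the function name is hypothetical:

```python
import re


def escape_tags(text):
    """Put a backslash before the < and > of simple tags like <Reference> and </Reference>."""
    # A lone < or > that is not part of a simple tag is left untouched.
    return re.sub(r"<(/?\w+)>", r"\\<\1\\>", text)


escaped = escape_tags("arthritis+ (<Reference>M01.3*</Reference>)")
```

The value of escaped is then identical to the example line above.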