Data Fields - Advanced Options
The following sections describe how to define data fields when your report/text file contains more complex formatting than that described in the Defining Data Fields topic. To use the features described in this topic, select Define Data Field from the right-click menu in the Data Panel.
- Tagged Data
- Columnar Data
- Columnar Data - No Heading - Single Line
- Columnar Data - No Heading - Multiple Lines
- Columnar Data - With Heading - Single Line
- Columnar Data - With Heading - Multiple Lines
- Other Data Formats
Tagged Data
Since tagged data occurs frequently in report files, we have included some special ways to handle it in the Data Extractor.
Tagged data can be considered any information in a report or text file where there is a field tag, or field name, followed by the data on one line of text. There is usually some kind of separator between the field tag and the data, such as a colon or a dash. In the Data Extractor, there is a list of tag separators from which to choose in the Source Options Window.
Tagged Data - Fixed Position
This section explains fixed-position tagged data where the field tag and its data occupy the same consistent horizontal position in a line from record to record within the report. Fixed position tagged data may appear in your text or report file in one of two ways:
- Each field tag and its data may be on its own line (as in the Tutorial 1 file TUTOR1.REP).
- There may be two or more field tags and their data on a line.
The procedures in this section are applicable to either of these two kinds of tagged data.
Note: It is not necessary to define the line style before using this procedure. The Data Extractor will use the field tag as a recognition pattern to create one automatically.
In the Data Panel, use the mouse to highlight a field tag, the field separator, and its data on a single line of text. Remember to extend the highlighted portion far enough to the right so wider data in subsequent records is picked up.
Example:
(Image: image\tagdata1.gif)
With the mouse positioned anywhere in the Data Panel (the large white area), right-click. Select Define Data Field > Parse Tagged Data.
The Data Extractor first defines the line style, using the field tag as the recognition pattern and line style name. It then defines the data field by fixed position, starting with the column immediately after the selected tag separator and ending with the last highlighted column. The field tag is also used as the name for this data field.
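As a rough illustration of the rule just described, the following Python sketch derives a field name and a fixed column range from a highlighted selection, then reads the same columns from another record. It is not the Data Extractor's implementation; the function names, the 0-based column numbering, and the sample lines are assumptions made for the example.

```python
def parse_tagged_selection(line, sel_start, sel_end, separator=":"):
    """Derive a field definition from a highlighted span of one line.

    sel_start and sel_end are 0-based columns (sel_end inclusive) of a
    hypothetical highlighted selection.
    """
    selection = line[sel_start:sel_end + 1]
    sep_at = selection.find(separator)
    if sep_at < 0:
        raise ValueError("no tag separator inside the selection")
    tag = selection[:sep_at].strip()
    # The data field starts in the column right after the separator
    # and ends with the last highlighted column.
    data_start = sel_start + sep_at + len(separator)
    return {"name": tag, "start": data_start, "end": sel_end}


def extract(line, field):
    """Read a fixed-position field, but only from lines that carry the tag."""
    if field["name"] not in line:
        return None
    return line[field["start"]:field["end"] + 1].strip()


first_record = "Customer: Acme Tools"
# The selection runs past the end of the data so wider values still fit.
field = parse_tagged_selection(first_record, 0, 39)
print(field)
print(extract("Customer: Northwind Traders Ltd", field))
```

If the selection ended at the last character of "Acme Tools", the longer name in the second record would be cut off, which is why the highlighted portion should extend to the right.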
Use the procedures above to define each piece of tagged data in one record. The matching lines and data fields in every other record of your report should automatically be defined. If they are not, you should edit the line style and/or the data field definition.
For a good example of how to work with tagged data using automatic features, see Data Extractor Tutorial 2 - Tagged Data and Automatic Features.
Tagged List Data - Fixed Position
This section deals with various report data formats that can be described as tagged list data. A tagged list consists of two or more lines of text that contain tagged data in a fixed position with one field tag and data field per line.
Tagged data can be considered any information in a report or text file where there is a field tag or field name followed by the data on one line of text. There is usually some kind of separator between the field tag and the data, such as a colon, a dash, or spaces. In the Data Extractor, there is a list of tag separators from which to choose in the Source Options Window.
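Tagged list data may appear in your text or report file in one of three basic ways:
- Variable Inner Margins / Fixed Outer Margins
- Variable Outer Margins / Fixed Inner Margins
- Fixed Left Margin (Field Tags and Data)
Each is described below along with the procedure for defining it in the Data Extractor.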
Variable Inner Margins / Fixed Outer Margins
This type of tagged list contains field tags and data where the field tags are left justified and the data is right justified within the text file.
The first step in defining this type of tagged list is to go to the Script Design Choices tab of the Source Options Window and select the correct tag separator. For this type of format select # of Spaces+ from the Tag Separator list box. Another list box, the Separator Spaces box, displays below the Tag Separator box. Select With Data from this list box to indicate that the spaces between the tag and the data should be collected with the data. This causes the Data Extractor to extend the width of the data fields out to the left to within two spaces of the end of the longest field tag.
Here is an example of this type of a tagged list:
(Image: image\parstag3.gif)
Notice that an entire block of text was highlighted. This highlighted block of data consists of one record of data.
After the block of data is highlighted, right-click in the Data Panel and select Define Data Field > Parse Tagged Data from the menu.
The Data Extractor defines each line style using the field tag found on each line of text within the highlighted block of text. Each line style is given a unique line style name corresponding to the field tag. The Data Extractor also defines each data field on each line, and the field names also correspond to the field tag.
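To make the idea of a space-separated tagged list concrete, here is a minimal Python sketch that splits each line of a block into a tag and its data. It assumes that the tag and data are separated by a run of two or more spaces; that reading of the # of Spaces+ separator, the function name, and the sample block are all assumptions for illustration, not documented behavior.

```python
import re

# Assumption: tag and data are separated by two or more spaces, so a
# single space inside a value such as "Margaret Jones" is preserved.
SEPARATOR = re.compile(r" {2,}")


def parse_space_separated(block):
    """Return {tag: data} for every line that contains the separator."""
    fields = {}
    for line in block.splitlines():
        parts = SEPARATOR.split(line.strip(), maxsplit=1)
        if len(parts) == 2:
            tag, data = parts
            fields[tag] = data
    return fields


# Made-up record: tags left justified, data right justified.
block = "\n".join([
    "Name          Margaret Jones",
    "AcctNo                 10045",
    "Status                Closed",
])
print(parse_space_separated(block))
```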
Variable Outer Margins / Fixed Inner Margins
This type of tagged list contains field tags and data where the field tags are right justified and the data is left justified within the text file.
The first step in defining this type of tagged list is to go to the Source Options Window and select the correct tag separator. For the example below, select ColonSpace(: ) from the Tag Separator list box.
Here is an example of this type of tagged list:
(Image: image\parstag2.gif)
Notice that an entire block of text is highlighted. This highlighted block of data consists of one record of data.
After the block of data is highlighted, right-click in the Data Panel and select Define Data Field > Parse Tagged Data.
The Data Extractor defines each line style using the field tag found on each line of text within the highlighted block of text. Each line style is given a unique line style name corresponding to the field tag. The Data Extractor also defines each data field on each line, and the field names will correspond to the field tag.
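The effect of parsing a highlighted block with the ColonSpace separator can be pictured with a short Python sketch. This is only an approximation under stated assumptions: the function name and the sample record are invented, and the real product works with line styles and column positions rather than a dictionary.

```python
def parse_tagged_block(block, separator=": "):
    """Return {field tag: data} for each line holding one tag and its data."""
    fields = {}
    for line in block.splitlines():
        tag, found, data = line.partition(separator)
        if not found:
            continue  # no separator on this line, so nothing to define
        # The tag doubles as the field name, as described above.
        fields[tag.strip()] = data.strip()
    return fields


# Made-up record: tags right justified, data left justified.
block = "\n".join([
    "    Name: Margaret Jones",
    "  AcctNo: 10045",
    "  Status: Closed",
])
print(parse_tagged_block(block))
```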
For a good example of this kind of data with instructions to parse it, see Data Extractor Tutorial 2 - Tagged Data and Automatic Features.
Fixed Left Margin (Field Tags and Data)
This type of tagged list contains field tags and data where both the field tags and the data are left justified within the text file.
The first step in defining this type of tagged list is to go to the Source Options Window and select the correct tag separator. For this type of format select # of Spaces+ from the Tag Separator list box. Another list box, the Separator Spaces box, will display below the Tag Separator box. Select With Tag from this list box to indicate that the spaces between the tag and the data should be collected with the tag. This will cause the Data Extractor to extend the width of the field tags out to the right to within two spaces of the beginning of the data fields.
Here is an example of this type of tagged list:
(Image: image\parstag1.gif)
Notice that an entire block of text was highlighted. This highlighted block of data consists of one record of data.
After the block of data is highlighted, right-click in the Data Panel and select Define Data Field > Parse Tagged Data.
The Data Extractor defines each line style using the field tag found on each line of text within the highlighted block of text. Each line style is given a unique line style name corresponding to the field tag. The Data Extractor also defines each data field on each line, and the field names correspond to the field tag.
Tagged Data - Floating Position
This section deals with tagged data where the field tag and its data occupy different horizontal positions in a line from one record to the next within the report. In other words, the tag and data "float" within the line of text.
Tagged data can be considered any information in a report or text file where there is a field tag or field name followed by the data on one line of text. There is usually some kind of separator between the field tag and the data, such as a colon or a dash.
The first step in defining this type of tagged data is to go to the Source Options Window and select the correct tag separator. For the example below, select ColonSpace(: ) from the Tag Separator list box.
Floating tagged data usually appears in your text or report file with multiple fields per line. It is the variable length of the data in these fields that causes them to float.
(Image: image\TAGDATA2.gif)
Notice how the person’s name in the second record is longer than the name in the first record. This makes the end of that field a variable, not fixed, position. The tag for the third data field determines where the second data field ends. Also, the variable size of the second data field forces the third data field (Status) into a different position within the line. Thus, it has a variable beginning. The second data field on the line has a floating end tag, and the third data field has a floating start tag.
To define the AcctNo field:
You can use the Parse Tagged Data option from the right-click menu for the AcctNo: field in this report. But use the following procedure for the "Name:" and "Status:" fields.
To define the Name field:
- Highlight the text Name: Margaret Jones. (Choosing the longer data makes the export field length close to the right length. You might still have to adjust it slightly on the Data Collection/Output tab, as described in the Export FldLength step below.)
- Right-click in the Data Panel and select Define Data Field > Parse Tagged Data. This creates a data field that begins two spaces past the "Name:" tag and names the data field "Name". The field becomes colored in the window, but the color is not in exactly the right place. It overlaps the "Status:" field tag in some records.
- Double-click the colored field. The Field Definition window opens. The start rule is Fixed Column. This is fine since this field begins in the same place in each record.
- Click the End Rule tab and select the Floating Tag radio button.
- Type Status: (case sensitive), including the colon, in the box to the right where the cursor is blinking. Since this field is not fixed in length, you need to set a maximum export length for it on the Data Collection/Output tab in the Export FldLength box.
- Click Add. The field returns to black. The Data Extractor cannot color data fields that are not fixed length and fixed position. Any field with a floating tag start or end rule remains in black text in the Data Panel.
To define the Status field:
- Highlight the text "Status: Closed" with the mouse and continue out to the end of the line.
- Right-click in the Data Panel and select Define Data Field > Parse Tagged Data. This automatically creates a data field with the correct name, end rule, and export field length. The start rule is not correct. The text of the field changes color in the window.
- Double-click the colored text.
- At the Start Rule tab, select the Floating Tag radio button.
- Type Status: (case sensitive), including the colon, in the box.
- Click Add. The field text returns to black now that it is defined with a floating tag start rule.
It is common for fields to have both a floating tag start rule and end rule.
Note: When you use floating tags to define a data field, the text for those fields does not change to any color except black in the Data Panel. Floating Tag data fields also do not underline in the Data Panel even if you have that option turned ON. To verify the Data Extractor is reading the field contents correctly, open the Data Record Browser Window. To edit the field, highlight a piece of the line, right-click, and choose Define Data Field > Edit Data Field.
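The start and end rules used in the procedures above can be sketched in Python as follows. The sketch treats a rule as either a fixed column or a floating tag and applies a maximum export length. The function, its parameters, and the two sample records are hypothetical and only approximate what the Field Definition window configures.

```python
def extract_field(line, start_col=0, start_tag=None, end_tag=None, max_len=None):
    """Extract one field using fixed-column or floating-tag rules."""
    if start_tag is not None:
        # Floating tag start rule: the field begins right after the tag.
        at = line.find(start_tag)
        if at < 0:
            return None
        begin = at + len(start_tag)
    else:
        # Fixed column start rule.
        begin = start_col
    stop = len(line)
    if end_tag is not None:
        # Floating tag end rule: the field ends where the next tag begins.
        found = line.find(end_tag, begin)
        if found >= 0:
            stop = found
    value = line[begin:stop].strip()
    return value if max_len is None else value[:max_len]


records = [
    "AcctNo: 10045  Name: Bob Lee  Status: Open",
    "AcctNo: 10046  Name: Margaret Jones  Status: Closed",
]
# Name starts in the same column in every record; only its end floats.
name_col = records[0].index("Name:") + len("Name:")
for line in records:
    name = extract_field(line, start_col=name_col, end_tag="Status:", max_len=20)
    status = extract_field(line, start_tag="Status:")
    print(name, "|", status)
```

Because the tag text is matched exactly, the comparison is case sensitive, which mirrors the instruction to type Status: exactly as it appears in the report.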
Columnar Data
Since columnar data occurs frequently in report files, we have included special ways to handle various types of columnar data in the Data Extractor.
Columnar data can be considered any information in a report or text file where there are two or more columns of data. It looks like a spreadsheet without the grid lines. There is usually some kind of separator between the columns, such as one or more spaces or a tab. In the Data Extractor, there is a list of column separators in the Source Options Window from which to choose. A single column of data is discussed below in the "Other Data Formats" section.
Columnar data often has column headings above the actual data. The Data Extractor allows you to use the column headings automatically as field names for the data fields. See the two Columnar Data - With Heading sections below.
Columnar Data - No Heading - Single Line
In the Data Panel, use the mouse to highlight across two or more columns of data within a single line of text.
Example:
(Image: image\colsingl.gif)
With the mouse positioned anywhere in the Data Panel, right-click. Select Define Data Field > Parse Columnar Data. One of two things happens:
- If the line of text was previously defined with a line style, the Data Extractor defines each column as a fixed position data field with fixed column start and end rules and leaves one or two spaces between each of the data fields. The number of spaces left between the data fields depends on which column separator is selected in the Source Options window. Field names default to LineStyleName_1, LineStyleName_2, etc.
- If the line of text was not previously defined with a line style, the Data Extractor defines the line style, using field tags, special characters, or the first field as the recognition pattern. The Data Extractor then defines each column as a fixed position data field with fixed column start and end rules and leaves one or two spaces between each of the data fields. The number of spaces left between the data fields depends on which column separator is selected in the Source Options window. Field names default to LineStyleName_1, LineStyleName_2, etc. If the Data Extractor finds no field tag, special character, or first field with which to define the line style, it returns a message indicating this. You then need to define a line style recognition pattern manually, after which you can follow these steps again to define the fields automatically.
Note: Columnar fields are automatically set to Flush Field Contents.
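A small Python sketch can show how fixed column start and end positions and default field names might be derived from one highlighted line. It assumes that columns are separated by two or more spaces and that the line style is named Sales; the helper names and sample lines are invented, and the sketch does not reproduce the spacing the Data Extractor leaves between fields.

```python
import re

# Assumption: a column is a run of text whose words are joined by single
# spaces; two or more spaces mark the gap between columns.
COLUMN = re.compile(r"\S+(?: \S+)*")


def define_columns(sample, line_style):
    """Return (field name, start, end) with 0-based inclusive columns."""
    fields = []
    for number, match in enumerate(COLUMN.finditer(sample), start=1):
        fields.append((f"{line_style}_{number}", match.start(), match.end() - 1))
    return fields


def read_fields(line, fields):
    """Read every fixed-column field from one line of the same line style."""
    return {name: line[start:end + 1].strip() for name, start, end in fields}


# Sample lines are built with padding so the columns are aligned.
sample = f"{'SALES/MARKETING':<18}{'1200.50':<10}{'East Region'}"
other = f"{'ENGINEERING':<18}{'980.00':<10}{'West'}"

fields = define_columns(sample, "Sales")
print(fields)
print(read_fields(other, fields))
```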
Columnar Data - No Heading - Multiple Lines
To highlight a specific block of the text, place the mouse pointer immediately to the left of the first character of data to be included in the highlighted selection. Click and drag to the right until the last character of data is included in the highlighted selection. All lines of text between the two corners are highlighted.
Example:
(Image: image\colwohdr.gif)
In this example, please note that the upper left corner of the block starts immediately to the left of the word "SALES/MARKETING", so the blank space left of the "S" is not a part of the highlighted text. The lower right corner of the block ends immediately after an empty data field in the last column.
Procedure
- In the Data Panel, use the mouse to highlight across two or more columns and lines of data. Do not include other data that is not columnar.
- With the mouse positioned anywhere in the Data Panel, right-click and select Define Data Field > Parse Columnar Data. One of four things happens:
- If all the highlighted lines of text were previously defined with the same line style, (named Sales for instance) the Data Extractor defines each column on each line as a data field using Fixed Column as the field definition. Field names default to Sales_1, Sales_2, etc.
- If the highlighted lines of text were not previously defined with any line style, the Data Extractor tries to define the line style, using only special characters found in the same columns in every highlighted line as the recognition pattern. If a recognition pattern is built, a line style is defined with the first field on the first line used as the line style name. The Data Extractor then defines each column on each line as a data field using fixed column as the field definition. Field names default to SALESMARKETING_1, SALESMARKETING_2, etc.
- If the highlighted lines of text were not previously defined with any line style, the Data Extractor tries to define the line style, using only special characters found in the same columns in every highlighted line as the recognition pattern. If no special characters are found, the Data Extractor cannot build a recognition pattern. A message box appears informing you that the Data Extractor "Couldn't add line definition, no recognition pattern found". If you get this message, you must manually define a line style that recognizes all the detail lines. For details, see Defining Line Styles. After the line style has been defined, you can use the Parse Columnar Data option to define the fields.
- If the highlighted lines of text are defined with different line styles on various lines of the highlighted text, attempt to define every line with the same line style before selecting the Parse Columnar Data option. Otherwise, the Data Extractor creates a new line style (if it can) and adds the data fields to it. In this case, you must use the ReOrder Line Styles option to position the Data Extractor-created line style ahead of the previously defined line styles.
Note: Columnar fields are automatically set to Flush Field Contents.
Columnar Data - With Heading - Single Line
To highlight a specific block of the text, place the mouse pointer immediately to the left of the first character of data to be included in the highlighted selection. Click and drag to the right until the last character of data is included in the highlighted selection. All lines of text between the two corners are highlighted.
(Image: image\colhsing.gif)
Procedure
- In the Data Panel, highlight across two or more columns of data within a single line of text, and also include any column headings that may appear above the data.
- With the mouse positioned anywhere in the Data Panel, right-click and select Define Data Field > Parse Columnar w/Heading.
- Type the number of lines occupied by the column heading and a number of lines for the Data Extractor to skip:
- If there is a line of dashes, asterisks, spaces, or other non-alpha characters immediately below the column heading, do not include this line when counting the Header Lines. Type the total number of heading lines in the box to the right of Header Lines.
- If there is other information or a blank line below the column heading that is not part of the column heading, such as a line of dashes, asterisks, spaces, or other non-alpha characters, count these lines, including blank lines and alpha lines, and type the total number in the box to the right of SKIP Lines. The Data Extractor calculates the number of data lines.
- Check to make sure it is correct, and then click OK in the Number of Header Lines dialog box. At this point, one of two things happens:
- If the line of text was previously defined with a line style, the Data Extractor defines each column as a fixed position data field with fixed column start and end rules. Each data field defaults with a field name similar to the column heading. For more details, see Field Names.
- If the line of columnar text was not previously defined with a line style, the Data Extractor defines the line style, using either special characters or first field as the recognition pattern, and then defines each column as a fixed position data field with fixed column start and end rules. Each data field defaults to a field name similar to the column heading. For more details, see Field Names.
Columnar Data - With Heading - Multiple Lines
To highlight a specific block of the text, including headers, place the mouse pointer immediately to the left of the first character of a header to be included in the highlighted selection. Click and drag down and to the right until the last character of data in a detail line is included in the highlighted selection. It is not necessary to highlight all of the detail lines that will be parsed, but be sure to highlight far enough to the right so data in the last column is not truncated in lines you may not be able to see in the display area. All lines of text between the two corners are highlighted.
Example:
(Image: image\colwhdr.gif)
Procedure
- In the Data Panel, highlight across two or more columns and lines of data, starting with a column heading in the upper left corner and ending with a detail line in the lower right.
- With the mouse positioned anywhere in the Data Panel, right-click and select Define Data Field > Parse Columnar w/Heading.
- Type the number of lines occupied by the column heading and a number of lines for the Data Extractor to skip:
- If there is a line of dashes, asterisks, or other non-alpha characters immediately below the column heading, do not include this line when counting the header lines. Type the total number of heading lines in the box to the right of Header Lines.
- If there is other information below the column heading that is not part of the data or the column heading, such as a line of dashes, asterisks, or other non-alpha characters, count these lines, including blank lines and alpha lines, and type the total number in the box to the right of SKIP Lines.
- The Data Extractor calculates the number of data lines. Check to see that it is correct. Click OK in the Number of Header Lines dialog box. At this point, one of four things happens:
- If all the highlighted lines of text were previously defined with the same line style, the Data Extractor defines each column as a fixed position data field with fixed column start and end rules on each of the highlighted lines of text. Each data field defaults to a field name similar to the column heading. For more details, see Field Names.
- If the highlighted lines of text were not previously defined with any line style, the Data Extractor tries to define the line style, using only special characters found in the same columns in every highlighted line as the recognition pattern. If a recognition pattern is built, the Data Extractor then defines each column as a fixed position data field with fixed column start and end rules on each of the highlighted lines and any other lines in the report that meet the line style recognition rules. Each data field defaults to a field name similar to the column heading. For more details, see Field Names.
- If the highlighted lines of text were not previously defined with any line style, the Data Extractor tries to define the line style, using only special characters found in the same columns in every highlighted line as the recognition pattern. If no special characters are found, the Data Extractor cannot build a recognition pattern. A message box appears informing you that the Data Extractor could not find anything by which to define the line styles. If you get this message, manually define the line styles. For details, see Defining Line Styles.
- If the highlighted lines of text are defined with different line styles on various lines of text, try to define every line with the same line style before selecting the Parse Columnar Data option. Otherwise, the Data Extractor creates a new line style, if it can, and adds the data fields to it. In this case, you must use the Re-Order Line Styles option to position the Data Extractor-created line style ahead of the previously defined line styles or delete the old line styles.
For a good example of how to work with columnar data with the automatic features, see Data Extractor Tutorial 5 - Columnar Data with a Footer.
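The roles of the Header Lines and SKIP Lines counts can be illustrated with a short Python sketch. It names each column after its heading, ignores the skipped lines, and finds column boundaries by looking for character positions that are blank on every heading and data line. That boundary rule, the function names, and the sample report are assumptions chosen for the example; the Data Extractor's own field naming rules are described in Field Names.

```python
def column_spans(lines):
    """Return (start, end) spans, end exclusive, split at all-blank columns."""
    width = max(len(line) for line in lines)
    padded = [line.ljust(width) for line in lines]
    spans, start = [], None
    for col in range(width):
        blank = all(line[col] == " " for line in padded)
        if not blank and start is None:
            start = col
        elif blank and start is not None:
            spans.append((start, col))
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


def parse_with_heading(lines, header_lines=1, skip_lines=1):
    """Name fields from the heading, skip separator lines, read the data."""
    heading = lines[:header_lines]
    data = lines[header_lines + skip_lines:]
    spans = column_spans(heading + data)
    names = [
        "_".join(" ".join(h[a:b] for h in heading).split())
        for a, b in spans
    ]
    return [
        {name: line[a:b].strip() for name, (a, b) in zip(names, spans)}
        for line in data
    ]


report = [
    "Department       Amount   Region",
    "---------------  -------  ------",
    "SALES/MARKETING  1200.50  East",
    "ENGINEERING       980.00  West",
]
for row in parse_with_heading(report, header_lines=1, skip_lines=1):
    print(row)
```

In this sample the line of dashes is counted as one skip line and is not part of the heading, matching the counting rules in the procedure above.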
Other Data Formats
If the data in your report or text file is not formatted as tagged or columnar data, or if you need to extract data out of header and/or footer lines, read the information in this section.
Selected Text - Single Line - With Continuation
For reports that contain data in a paragraph or other long multi-line format, it is necessary to define a continuation rule. This causes all the data to be gathered in a single data field.
The steps in this section are based on the assumption that you have already defined the line style for the line of text on which you now want to define data fields. For information on how to define line styles, see Defining Line Styles.
Define a line style that recognizes the first line of the data only. Highlight the data just on that one line.
(Image: image\tutor6example.gif)
Procedure
- Highlight the data on the first line, the one you defined with a line style.
- With the mouse positioned anywhere in the Data Panel, right-click and select Define Data Field > New Data Field.
- Give the data field a meaningful field name, if desired.
- Fixed Column is the default selection on the Start Rule and End Rule tabs. Change these values, if necessary.
- Click the Continuation tab, and select the continuation rule that is appropriate to your report format. Until Next Line Style is probably the most appropriate for the example above since the Remarks field could continue an indefinite number of lines and it is not known if the Unit Price line style will always be the line style following it.
- Since this field will not be fixed length, on the Data Collection/Output tab, set the Output FldLength for export purposes.
- Click Add. The Field Definition window closes unless you have turned Close Definition Dialogs on Add/Update OFF in the Preferences menu.
For details about each Continuation option, see Field Definition Window.
The field is not fixed length. The first line is fixed length and position and is colored and underlined. The rest of the field is not.
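A continuation rule such as Until Next Line Style can be pictured with the following Python sketch, which keeps appending lines to a Remarks field until a line matching any defined line style appears. The recognition patterns, the handling of blank lines, and the sample report are assumptions for illustration only.

```python
# Hypothetical recognition patterns standing in for defined line styles.
LINE_STYLES = ("Item:", "Remarks:", "Unit Price:")


def matches_line_style(line):
    return any(line.lstrip().startswith(tag) for tag in LINE_STYLES)


def collect_remarks(lines, max_len=200):
    """Gather each Remarks field across its continuation lines."""
    collected, current = [], None
    for line in lines:
        if line.lstrip().startswith("Remarks:"):
            # First line of the field: defined by the Remarks line style.
            current = [line.split("Remarks:", 1)[1].strip()]
            collected.append(current)
        elif matches_line_style(line):
            # Any other line style ends the continuation.
            current = None
        elif current is not None and line.strip():
            current.append(line.strip())
    # The field is not fixed length, so cap it at an export length.
    return [" ".join(parts)[:max_len] for parts in collected]


report = [
    "Item: Widget",
    "Remarks: Customer asked for",
    "         expedited shipping and",
    "         gift wrapping.",
    "Unit Price: 4.50",
    "Item: Gadget",
    "Remarks: None",
    "Unit Price: 9.00",
]
print(collect_remarks(report))
```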
Header Lines
When header lines of text are present in your report file, some of them may contain information you want to extract and define as a data field in each record. For example, there may be a report title or report date that you want as a data field in each record in your target file.
You should first define a line style for any header line that contains data you want to extract. The COLLECT Fields line action is usually what is needed. Then in each of those lines, define each data field you want to extract. Continue defining the rest of the lines and fields.
You might also want to reposition the data fields prior to exporting the data. Open the Export Field Order Window by selecting Field > Export Field Layout and moving the data fields to the desired position.
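The effect of collecting a header field and repeating it in every record can be sketched in Python. The report layout, the field names, and the choice of treating every non-blank detail line as an accepted record are assumptions for this example.

```python
def records_with_header(lines):
    """Attach the most recent report date to every detail record."""
    report_date = None
    records = []
    for line in lines:
        if line.startswith("Report Date:"):
            # Header line: collect the field and keep it for later records.
            report_date = line.split(":", 1)[1].strip()
        elif line.strip():
            # Detail line: accept a record that includes the header field.
            records.append({"ReportDate": report_date, "Detail": line.strip()})
    return records


report = [
    "Report Date: 01/31/2024",
    "Widget  4.50",
    "Gadget  9.00",
]
for record in records_with_header(report):
    print(record)
```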
Footer Lines
When footer lines of text are present in your report file, some of them may contain information you want to extract and define as a data field in each record.
If your data does not contain detail lines that would normally be ACCEPT Record lines, make the footer the ACCEPT Record line and proceed normally.
If the previous lines are detail lines, select COLLECT Fields as the line action rather than ACCEPT Record, as usual. For each data field within the detail lines, go to the Data Collection tab in the Field Definition Window and turn Array Field ON for each field.
Proceed to the footer section and define the line or lines that contain the desired data. Select ACCEPT Record as the line action for the last defined line style. When you define the data fields on the footer lines, do not turn Array Field ON for these fields. For an example of arrayed fields with a footer line, see Data Extractor Tutorial 5 - Columnar Data with a Footer.
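The interaction between arrayed detail fields and a footer that accepts the record can be sketched in Python as follows. Detail values accumulate in a list until the footer line arrives, at which point one record is emitted. The TOTAL footer tag, the field names, and the sample lines are hypothetical.

```python
def build_records(lines):
    """Emit one record per footer, with detail values held as an array."""
    records = []
    amounts = []  # array field filled from the detail lines
    for line in lines:
        if line.startswith("TOTAL"):
            # Footer line: accept the record. The footer field is a
            # single value, not an array.
            records.append({"Amount": amounts, "Total": line.split()[1]})
            amounts = []
        elif line.strip():
            # Detail line: collect the field into the array.
            amounts.append(line.split()[-1])
    return records


report = [
    "Widget  4.50",
    "Gadget  9.00",
    "TOTAL  13.50",
    "Bolt  1.25",
    "TOTAL  1.25",
]
for record in build_records(report):
    print(record)
```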
You may want to reposition the data fields prior to exporting the data. Open the Export Field Order Window and move the data fields to the desired position.