Recognition Rules

The recognition rule portion of a line style definition contains criteria by which the Data Extractor identifies a line of text. In other words, you tell the Data Extractor how to recognize a line or lines of the report by defining a set of criteria. After you define a line style in one section of your report, the Data Extractor compares each line of text in your entire report file with that recognition rule. For each line of text that matches the recognition rule, the Data Extractor assigns that line style. The line style name displays in the Line Style column to the left of the Data Panel for each matching line of text in the report.

The trick to defining a good recognition rule is to make it specific enough to NOT include any lines you do NOT want recognized, and broad enough to include ALL the lines you DO want recognized.
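The matching process described above can be pictured with a short Python sketch. This is only an illustration, not the Data Extractor's own implementation: the `Rule` type, the `assign_line_styles` function, and the choice to let the first matching rule win are all assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical representation: a rule is a (line style name, test function) pair.
Rule = Tuple[str, Callable[[str], bool]]

def assign_line_styles(lines: List[str], rules: List[Rule]) -> List[Optional[str]]:
    """Compare every line with every rule; keep the first matching style name."""
    styles: List[Optional[str]] = []
    for line in lines:
        # Assumption: the first rule that matches wins; None means "undefined".
        name = next((n for n, matches in rules if matches(line)), None)
        styles.append(name)
    return styles

report = [
    "INVOICE 1001",
    "  Widget      2   9.50",
    "INVOICE 1002",
    "  Gadget      1  12.00",
]
rules = [("Header", lambda text: text.startswith("INVOICE "))]
styles = assign_line_styles(report, rules)
print(styles)  # ['Header', None, 'Header', None]
```

The rule is defined once, yet it labels every matching line in the whole report, which is why a rule that is too broad or too narrow shows up immediately in the Line Style column.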

You may define line styles manually or let the Data Extractor automatically define them, depending upon whether or not your data can be handled by the Data Extractor's automatic features. For details about this option, see File Menu and Pop-up Menus.

Note: You may want to use the more advanced features after becoming familiar with the basic procedures. Tutorial 1 will help you get acquainted with the basics of the Data Extractor, and Tutorials 2 and 3 will introduce you to some of the time-saving advantages of the advanced features.

If you are defining a line style manually, the Data Extractor suggests a recognition rule based on a pattern that displays in the Line Style Definition window when it opens. If you have highlighted a particular portion of the line, this portion is automatically suggested to create the recognition rule. You may modify the suggested recognition rule before adding it to the Data Extractor database and script. Details about different ways to define line styles are found in this documentation.

How you define line styles depends on two main factors: your own approach to the task, and the type of text or report file with which you are working. The sections below should help you determine the best approach.

Remember that the selections in the Source Options window should be examined and, if necessary, modified before you define line styles. For details about the available options, see Source Options Window.

Recognized by

When you define a line style, you are specifying a recognition rule that the Data Extractor uses to identify any line of text in your file that matches that rule. Recognition rules are built in the Line Style Definition window. Each line style recognition rule is based on one of seven basic recognition styles. The available styles are described below.

Each style consists of an expression that specifies a search criterion. To see all the available options for building recognition rules, see Line Style Definition Window.

The following are brief examples of where and how to use each of the seven basic recognition styles:

Pattern

If a string of text always appears on one type of line and never appears on any other line, highlight that text. It becomes the pattern that the Data Extractor uses to identify that line in each record. In some cases, the unique string of text may even be a single character in a consistent position.

This is the most common style to use for a recognition rule. It offers the most flexibility, but can also be the most difficult to define. Patterns are defined by either single-row or multiple-row expressions. Recognition patterns are described in more detail later in this documentation.
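As a rough illustration of a single-row pattern, the Python sketch below tests whether a piece of text sits at a fixed column of a line. The function name `matches_at` and the 0-based column numbering are assumptions for this example; the actual pattern syntax of the Data Extractor is not shown here.

```python
def matches_at(line: str, text: str, column: int) -> bool:
    """True if `text` appears in `line` starting at the 0-based `column`."""
    return line.startswith(text, column)

lines = ["Date: 01/02/2024", "Total: 45.00", "Note: Date: n/a"]

# "Date:" must start in column 0, so the third line is not recognized
# even though it contains the same text further along.
date_lines = [i for i, line in enumerate(lines) if matches_at(line, "Date:", 0)]
print(date_lines)  # [0]

# A single character in a consistent position can also serve as a pattern.
print(matches_at("A 100", "A", 0))  # True
```

Tying the text to a position is one way to make a rule more specific without listing every line it should exclude.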

Relative Position

If the line of text you are defining appears in the same relative position to some other line, a Base Line, you might define the recognition rule by its relative position to that other line. The Relative Position option allows you to specify one or more lines above or below the Base Line. The line of text you want to use as a Base Line must be defined before you define a line style that refers to it.
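The idea of a Base Line and an offset can be sketched in Python as follows. The function `relative_matches` and the convention that a positive offset means "below" are assumptions made for illustration only. Note that the Base Line styles must already be assigned before the function can be used, which mirrors the requirement above.

```python
from typing import List, Optional

def relative_matches(styles: List[Optional[str]], base_style: str, offset: int) -> List[int]:
    """Return indexes of lines located `offset` lines from each Base Line.

    A positive offset looks below the Base Line, a negative offset above it.
    """
    hits = []
    for i, style in enumerate(styles):
        target = i + offset
        # Skip offsets that would fall outside the report.
        if style == base_style and 0 <= target < len(styles):
            hits.append(target)
    return hits

# Styles already assigned by an earlier rule; None means "undefined".
styles = ["Header", None, None, "Header", None, None]

print(relative_matches(styles, "Header", 1))   # [1, 4]
print(relative_matches(styles, "Header", -1))  # [2]
```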

Exact Line Number

If the line of text you are defining appears on only one line of the report, you might define the recognition rule by its Exact Line Number. This option is rarely used because it can recognize only lines that occur once in a report.

Blank Line

If you need to define the lines of the report that do not contain any text, you can define the recognition rule for them by Blank Line.

It is not necessary to define the blank lines in your report except when you need to use them as Base Lines, Accept lines, or markers of some kind.

All Undefined Lines

After you have defined all the necessary lines of text in your report, you may want to use this recognition rule to define all of the remaining undefined lines. Alternatively, if all but a few of the lines of your report need to be recognized in the same way, you may want to define the lines you do not want, and then use All Undefined Lines to define the lines that you do want. Another reason you might want to define all undefined lines is for use in the Debug Extract Design Window.

The default line action for this option is COLLECT Fields, but you may select any Action that fits your needs.

Note: It is not necessary to define the undefined lines.
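The "define what you do not want, then catch the rest" approach can be pictured with this Python sketch. The function `fill_undefined` and the style names are hypothetical and serve only to illustrate the idea.

```python
from typing import List, Optional

def fill_undefined(styles: List[Optional[str]], fallback: str) -> List[str]:
    """Give every line that has no style yet the `fallback` style."""
    return [style if style is not None else fallback for style in styles]

# Only the unwanted lines were defined explicitly; None means "undefined".
styles = ["Header", None, "Total", None]

filled = fill_undefined(styles, "Detail")
print(filled)  # ['Header', 'Detail', 'Total', 'Detail']
```

Because the catch-all only touches lines that are still undefined, it should be applied after the other line styles are in place.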

Pattern & Relative Position

Select this option when you want the Data Extractor to define a line style based on its relative position to another line AND by some specific characters or types of characters. Before selecting this option you must have already defined another line to use as your Base Line.

The recognition rule options for Pattern and for Relative Position, as well as special options for using both, are described in detail in this documentation.

The default line action for this option is COLLECT Fields, but you may select any Action that fits your needs.
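Combining both conditions can be sketched in Python as shown below. This is an illustration under stated assumptions: `pattern_and_relative` is a hypothetical function, and a regular expression stands in for the Data Extractor's own pattern expressions, which may work differently.

```python
import re
from typing import List, Optional

def pattern_and_relative(
    lines: List[str],
    styles: List[Optional[str]],
    base_style: str,
    offset: int,
    pattern: str,
) -> List[int]:
    """Indexes of lines that are `offset` lines from a Base Line AND match `pattern`."""
    regex = re.compile(pattern)
    hits = []
    for i, style in enumerate(styles):
        target = i + offset
        if style != base_style or not 0 <= target < len(lines):
            continue  # not a Base Line, or the offset falls outside the report
        if regex.search(lines[target]):
            hits.append(target)
    return hits

lines = ["CUSTOMER", "12345 Acme", "CUSTOMER", "(no account)"]
styles = ["Header", None, "Header", None]  # Base Lines defined beforehand

# One line below each Header, but only if it starts with a 5-digit number.
print(pattern_and_relative(lines, styles, "Header", 1, r"^\d{5}\b"))  # [1]
```

The position test alone would recognize both lines 1 and 3; adding the pattern excludes the line that does not carry an account number.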

Non-Blank Line

Select this option when you want the Data Extractor to use this line style to define all lines in the report that contain anything other than spaces and an end of line character.

There is no recognition rule for this option. Its behavior is automatic.
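The distinction between Blank Line and Non-Blank Line can be expressed as a one-line test. The Python sketch below follows the wording above literally, treating only spaces and end-of-line characters as "blank"; how the Data Extractor treats tabs or other whitespace is not stated here, so counting a tab as content is an assumption.

```python
def is_blank(line: str) -> bool:
    """True if the line holds nothing but spaces and end-of-line characters."""
    # Assumption: tabs count as content, since only spaces are mentioned.
    return line.strip(" \r\n") == ""

print(is_blank("     \n"))    # True
print(is_blank(""))           # True
print(is_blank("  Total  "))  # False
```

A Non-Blank Line is then simply any line for which `is_blank` returns False.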

More About Line Styles

Once you have defined a line style and added it to the script, the line style name displays in the Line Style Column to the left of all lines that match that recognition rule.

If a line style name appears on any line that should not be included, select Edit Line Style and make the recognition rule more specific so the unwanted lines of text do not meet the recognition rule's criteria.

If a line style name does not appear on any line that should be included, select Edit Line Style and make the recognition rule broader so the needed lines of text match the recognition rule's criteria.

If the name of the line style you just defined does not appear on any line at all in the Line Style Column, then the recognition rule does not match any line in the report. Place the mouse pointer in the Line Style Column of any blank line and double-click, or select Line > Edit Line Style. When the Line Style Definition window opens, select the line style name you want to modify from the Line Name list box, and edit the recognition rule so the expected lines of text meet the recognition rule's criteria.
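The three situations above (unwanted matches, missed lines, and no matches at all) can be pictured as a comparison between the lines a rule matches and the lines you intended it to match. The Python sketch below is purely illustrative; `audit_rule` is a hypothetical helper and not a feature of the Data Extractor.

```python
from typing import Callable, List, Set, Tuple

def audit_rule(
    lines: List[str],
    matches: Callable[[str], bool],
    wanted: Set[int],
) -> Tuple[List[int], List[int]]:
    """Return (unwanted, missed) line indexes for a rule.

    unwanted: matched but not wanted, so the rule is too broad.
    missed:   wanted but not matched, so the rule is too narrow.
    """
    matched = {i for i, line in enumerate(lines) if matches(line)}
    return sorted(matched - wanted), sorted(wanted - matched)

lines = ["TOTAL 10", "SUBTOTAL 4", "TOTAL 12", "Total 3"]
wanted = {0, 2, 3}  # the lines that should carry the line style

unwanted, missed = audit_rule(lines, lambda text: "TOTAL" in text, wanted)
print(unwanted)  # [1]  -> make the rule more specific
print(missed)    # [3]  -> make the rule broader
```

If both lists are empty, the rule matches exactly the intended lines; if every wanted line is missed and nothing is matched, the rule matches no line in the report.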

Modify a Recognition Rule

If you need to modify the recognition rule of a line style, highlight any part of a line. Then select Line > Edit Line Style, or place the mouse pointer on a line style name in the Line Style Column and double-click. When the Line Style Definition window opens, modify the recognition rule and click Update.

How the Data Extractor Builds Recognition Patterns

The following sections describe how the Data Extractor builds recognition patterns.

New Line Style

When you select New Line Style from the menu, the Data Extractor creates a suggested line style recognition pattern that displays when the Line Style Definition window opens. For details, see Recognition Rules. If you highlighted a piece of the data on the line before making the selection, the Data Extractor uses the data you highlighted to form the recognition rule.

If the pattern meets your needs and the default line style name is acceptable, select a line action and then click Add to accept the recognition pattern. See Line Action. The Line Style Definition window closes unless you have turned off the Close Definition Dialogs on Add/Update option in the Preferences menu.

If you need to change the pattern or the line style name, enter a new name or modify the pattern, then select a line action and click Add. The Line Style Definition window closes unless you have turned off the Close Definition Dialogs on Add/Update option in the Preferences menu.

Auto New Line Style

When you select Auto New Line Style from the menu, another menu prompts you to select a line action. See Line Action. After you select a line action, the Data Extractor creates a line style recognition pattern automatically.

The default line style name displays in the Line Style Column for each line of text in your report file that matches the recognition pattern.

If you need to modify the line style recognition pattern or the line style name, double-click the line style name in the Line Style Column to open the Line Style Definition window, and edit the recognition pattern, the name, or both. After making the desired modifications, click Update. The Line Style Definition window closes unless you have turned off the Close Definition Dialogs on Add/Update option in the Preferences menu.