IVI: Scanning

Scanning, Counting and Replacement

IVI has a range of facilities for locating, counting and replacing particular strings, records, and so on. These are described on this page.


Command: FInd

Purpose: Initiate a scan, starting at the cursor position, for some occurrence defined by the parameters.

Format: FI <digit>|<scan string>

Parameters: The scan can range from a simple search for a particular string to a complex search for a record having a set of properties. <digit> selects one of seven different types of scan. The interpretation of <scan string> depends on the type selected.

In general, the scan starts at the current position of the cursor, continuing until the first instance of the specified occurrence is located. If successful, the cursor is placed at the start of the occurrence; otherwise it is left where it is. In either case, IVI returns to TYPING/INSERT mode.

The top element of the result stack is set to 1 or 0 to indicate success or failure, and is displayed on the second-to-last line of the screen. Unfortunately, in this case (as distinct from Find restart), it is overwritten before you get to see it!


Command: SCanset

Purpose: Set the scan parameters for subsequent Find restart, Number or Check operations

Format: SC <digit>|<scan string>

Parameters: The parameters are the same as those for FInd. SCanset stores them, but does not trigger an actual scan, nor does IVI return to TYPING/INSERT mode.


Types of Search

Types 0 to 4 are similar to one another. In each case <scan string> is the string being sought, and the search algorithm first attempts to match it starting at the cursor. For type 0, the search then proceeds on a character-by-character basis, so that the string will be found wherever it occurs in the file from the cursor on downwards.

For type 1, on the other hand, subsequent matches will be attempted beginning at the start of each word; so the string will be located only if it starts a word. For type 2, subsequent matches are attempted in column 1 of each line; for type 3, at the start of each sentence.

For type 4, the scan proceeds on a character-by-character basis, but backwards from the cursor. (The string itself, however, occurs in the text in the usual left-to-right order.) There are no backwards scans corresponding to types 1 to 3.
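The differences between types 0 to 4 can be pictured with a short Python sketch. It is not IVI's implementation: the text is modelled as a single string, the cursor as an index into it, and the function names are invented for illustration. The definitions of "start of a word" and "start of a sentence" are assumptions, since IVI's exact rules are not given here.

```python
import re

def allowed_starts(text, scan_type):
    """Positions where a match may begin, or None for 'anywhere'."""
    if scan_type in (0, 4):
        return None
    if scan_type == 1:
        # Assumption: a word starts at a letter/digit that follows a non-word character.
        return {m.start() for m in re.finditer(r"\b\w", text)}
    if scan_type == 2:
        # Column 1 of each line.
        return {0} | {i + 1 for i, ch in enumerate(text) if ch == "\n"}
    if scan_type == 3:
        # Assumption: a sentence starts after ., ! or ? followed by white space.
        return {0} | {m.end() for m in re.finditer(r"[.!?]\s+", text)}
    raise ValueError("only scan types 0 to 4 are sketched here")

def find(text, cursor, scan_type, scan_string):
    """Return (result, new_cursor); result is 1 on success, 0 on failure."""
    allowed = allowed_starts(text, scan_type)
    if scan_type == 4:
        order = range(cursor, -1, -1)          # backwards from the cursor
    else:
        order = range(cursor, len(text) + 1)   # forwards from the cursor
    for pos in order:
        # The first attempt is always made at the cursor itself.
        if pos != cursor and allowed is not None and pos not in allowed:
            continue
        if text.startswith(scan_string, pos):
            return 1, pos
    return 0, cursor                           # failure: cursor stays put
```

For instance, with the text "the other theme" and the cursor at position 1, a type 0 scan for "the" stops inside "other", whereas a type 1 scan skips on to "theme".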

The type 5 scan is somewhat different. <scan string> is a Boolean expression template, either numerical or string, in the form described in Editing Records: Extended Check. The search starts with the current record, regardless of the cursor position within it, and looks for a record for which the resulting expression is true.

In a type 6 scan, <scan string> is a ternary pattern of 0's, 1's and ?'s. The search is to locate a record (in one of the record views) or, on a word-by-word basis, a text position (in text view), whose extended check string matches the ternary pattern. Further details are given in Editing Records: Extended Check.
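Matching against a ternary pattern can be sketched in Python as follows. The function name is invented, and the sketch assumes that a ? accepts either digit and that the check string and the pattern must be of equal length; neither point is stated here.

```python
def matches_ternary(check_string, pattern):
    """True if every 0 or 1 in the pattern equals the corresponding digit."""
    if len(check_string) != len(pattern):
        return False  # assumption: lengths must agree
    # A '?' in the pattern accepts either digit at that position.
    return all(p == "?" or p == c for c, p in zip(check_string, pattern))
```

Under these assumptions the check string "1001101" matches the pattern "1?011?1", while "1001001" does not, because its fifth digit is 0 where the pattern demands 1.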

Beware: braces {} occurring in <scan string> and all characters within them are translated as control characters in the manner described in Functions, etc. These are not matched directly to the same characters in the corefile, but are interpreted to affect the matching process. See below.

Examples:

FI 0|xyzw
      - find the next occurrence of "xyzw" (starting at any position)
FI 1|xyzw
      - find the next occurrence of "xyzw" (at the start of a word)
FI 1
      - the "null string" matches anywhere, so this returns with the cursor where it was. However, the Find restart will now find the start of the next word.
FI 2|fred
      - find a line beginning "fred" (in column 1)
FI 3|Once upon
      - find a sentence starting "Once upon"
FI 4|ing
      - find the first occurrence of "ing" before the cursor position. (Note: find "ing", not "gni")
FI 5|#2<0.7!#3=#5+#7
      - find a record with field 2 numerically less than 0.7 OR field 3 numerically equal to field 5 plus field 7 (reminder: the exclamation point represents the Boolean OR operator).
FI 5|$#1=ab\de
      - find a record in which field #1 begins with string "ab" OR string "de".
FI 6|1?011?1
      - find a record whose extended check string matches "1?011?1"

Control characters

We noted above that control characters can be included within a <scan string> by typing {A} for <CTRL A>, and so on. In scans of types 0 to 4, five of these are interpreted to have special meanings (the rest are ignored):

<CTRL A> - match any sequence of characters 0 to 20 in length.

<CTRL C> - match any single character.

<CTRL E> - match one or more word-end markers (space, punctuation, end-of-line).

<CTRL N> - invert the result of FInd.

<CTRL W> - match 0 to 20 characters excluding word-end markers.

All except the fourth are sometimes known as "wildcard" characters. There ought to be control characters which indicate whether matches are to be case-dependent or case-independent, but these have not been implemented.

Control characters may also appear in the individual strings in the string Boolean of a type 5 scan.

A string may not contain two control characters in succession.

Examples:

FI 0|de{A}d
      - match any sequence of characters beginning with "de" and ending with "d" (the intervening string must have 20 or fewer characters).

FI 1|pr{W}d{E}
      - match any word beginning with "pr" and ending with "d". If {W} were replaced by {A}, "pr" could be in one word, "d" in another.

FI 1|{N}the
      - find a word that does not begin "the".
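For readers familiar with regular expressions, the five control characters can be approximated by the Python sketch below. This is an analogy only, not a description of how IVI performs the match. The set of word-end markers (white space plus ASCII punctuation), the treatment of the end of the text as a word end, and the application of {N} to each attempted match are all assumptions, as are the function names.

```python
import re
import string

PUNCT = re.escape(string.punctuation)   # assumed set of punctuation word-end markers

TOKENS = {
    "A": r".{0,20}",                     # any 0 to 20 characters (not crossing a line here)
    "C": r".",                           # any single character
    "E": rf"(?:[\s{PUNCT}]+|$)",         # one or more word-end markers, or end of text
    "W": rf"[^\s{PUNCT}]{{0,20}}",       # 0 to 20 characters that are not word-end markers
}

def to_regex(scan_string):
    """Return (compiled pattern, inverted) for a scan string such as 'pr{W}d{E}'."""
    parts, inverted = [], False
    for piece in re.split(r"(\{[A-Z]\})", scan_string):
        if re.fullmatch(r"\{[A-Z]\}", piece):
            letter = piece[1]
            if letter == "N":
                inverted = True                       # invert the result
            else:
                parts.append(TOKENS.get(letter, ""))  # other control characters are ignored
        else:
            parts.append(re.escape(piece))            # ordinary text matches literally
    return re.compile("".join(parts)), inverted

def matches_at(text, pos, scan_string):
    """True if the scan string matches (or, with {N}, fails to match) at pos."""
    pattern, inverted = to_regex(scan_string)
    hit = pattern.match(text, pos) is not None
    return hit != inverted
```

With these definitions, "pr{W}d{E}" matches at the start of "proud of" but not at the start of "pr and d ", while "pr{A}d{E}" matches both.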


Operation: Find restart

Purpose: (Re)start a scan, using the parameters last set by FInd or SCanset

Default Key: <CTRL N>

Operation Number: 32

Before the scan resumes, the cursor is moved to avoid finding the occurrence just located. As in the case of FInd, the result is put in the result stack, and the cursor moved if the search succeeds. Happily, in this case, you actually get to see the result on the second-to-last line of the screen.

Note the subtle difference between FInd and Find restart in that the latter moves the cursor before searching. The former therefore finds the item if the cursor is already on it; the latter does not. This implies that SCanset followed by Find restart is not exactly the same as FInd.
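The difference can be made concrete with a small Python sketch of a type 0 scan. The text is modelled as a string and the cursor as an index; the function names are invented, and it is an assumption that Find restart advances the cursor by exactly one character before scanning.

```python
def find(text, cursor, scan_string):
    """FInd: the first attempt is made at the cursor itself."""
    pos = text.find(scan_string, cursor)
    return (1, pos) if pos != -1 else (0, cursor)

def find_restart(text, cursor, scan_string):
    """Find restart: step past the cursor before scanning (assumed: one character)."""
    pos = text.find(scan_string, cursor + 1)
    return (1, pos) if pos != -1 else (0, cursor)
```

With the text "abcabc" and the cursor at position 0, find locates "abc" at position 0, where the cursor already is, whereas find_restart moves on to position 3.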


Operation: Number

Purpose: Count the number of specified occurrences from the cursor position to the end of the corefile (the start for type 4), using the parameters last set by FInd or SCanset.

Default Key: <ESC> P

Operation Number: 63

The result is left in the result stack, and displayed on the second-to-last line of the screen.


Operation: Check

Purpose: Check for the presence of the specified occurrence at the cursor, using the parameters last set by FInd or SCanset.

Default Key: <ESC> R

Operation Number: 64

The result is left in the result stack, and displayed on the second-to-last line of the screen. The operation is really intended for use in functions, procedures and driver programs. (See Functions, etc.)

What is the relation between the Check of type 5, the Extended check and the Check of type 6?

The Check of type 5 takes a single Boolean expression template and returns either 0 or 1, in effect a single binary digit, as result.

The Extended check takes a sequence of Boolean expression templates, and returns a sequence of binary digits: an extended check string.

The Check of type 6 also computes the extended check string, but compares it to a ternary pattern to produce a single binary digit as result.
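The relationship between the three can be summarized in a Python sketch. Here a Boolean expression template is stood in for by an ordinary Python function of a record, and a record by a dictionary from field number to value; both are simplifications made for illustration, and the function names are invented. The ternary comparison assumes that ? accepts either digit and that the lengths must agree.

```python
def check_type5(record, predicate):
    """One Boolean expression in, one binary digit out."""
    return 1 if predicate(record) else 0

def extended_check(record, predicates):
    """A sequence of Boolean expressions in, an extended check string out."""
    return "".join(str(check_type5(record, p)) for p in predicates)

def check_type6(record, predicates, pattern):
    """Compute the extended check string, then compare it to a ternary pattern."""
    bits = extended_check(record, predicates)
    ok = len(bits) == len(pattern) and all(
        p in ("?", b) for b, p in zip(bits, pattern)
    )
    return 1 if ok else 0

# Stand-ins for the templates "#2<0.7!#3=#5+#7" and "#2<0.7".
record = {2: 0.9, 3: 5, 5: 2, 7: 3}
first = lambda r: r[2] < 0.7 or r[3] == r[5] + r[7]
second = lambda r: r[2] < 0.7
```

For this record, extended_check with the two predicates gives "10", so a type 6 check against the pattern "1?" succeeds and one against "?1" fails.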

Note, by the way, that in a Check of type 5, as in a FInd command of type 5, the individual strings in a string Boolean may contain control characters. A simple trial suggests that the same is not true of a Check of type 6 or an Extended check, but in fact it is: the matching mechanism works. The problem is that character sequences such as {W} in corefile 18 are not translated into <CTRL W>, so the user must find some alternative way to get control characters into the Boolean expressions in the corefile.


Command: CHange

Purpose: Replace one string by another everywhere starting at the cursor position

Format: CH <string>|<string>

Parameters: The first string is the one to be replaced (the source string), the second, the replacement (the destination string).

Be careful that this is really what you want to do because it is often irreversible. (If you replace all instances of "the" by "a", you cannot undo this when you realize that "there" has been changed to "are"!) If you are uncertain, use a two-stage operation:

Obviously, stage 1 can be reversed if necessary.

After any replacement, the cursor is placed (internally) on the character following the destination string, and the scan resumes from here. This means that the destination string itself is not re-scanned, and if it happens to contain the source string, no loop occurs. After the final replacement (if any), the cursor is left on the character after the last replaced instance. The editor remains in COMMAND mode.
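The scanning rule just described can be sketched in Python. The text is modelled as a string and the cursor as an index into it; the function name is invented, and leaving the cursor unchanged when nothing is replaced is an assumption.

```python
def change(text, cursor, source, destination):
    """Replace every instance of source from the cursor onwards.

    Returns (new_text, new_cursor).
    """
    if not source:
        raise ValueError("the source string may not be empty")
    pos = text.find(source, cursor)
    while pos != -1:
        text = text[:pos] + destination + text[pos + len(source):]
        # Resume on the character after the destination string, so that
        # the replacement itself is never re-scanned.
        cursor = pos + len(destination)
        pos = text.find(source, cursor)
    return text, cursor
```

Changing "a" to "aa" in "aaa" therefore terminates with "aaaaaa" rather than looping, and changing "the" to "a" in "the there" yields "a are", which is the hazard described above.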

The source string must not contain wildcards. So you cannot change the word "the" in the same way that you can FInd it. You can use the source string " the ", but this is not the only way in which the word "the" can occur. It may be at the start or end of a line, it may be capitalized at the start of a sentence, and so on.

The destination string, but not the source, may be empty; in which case all instances of the source are deleted.



Last revision: January 23, 2002.