Capri Download

Capri: capri.rar
Sample Data: data_samples.rar


The Java class files are provided in a downloadable rar file. Extract the class files into a directory. Install the Java run-time environment, version 1.6, and make sure that it is on the system path. Then run the tool from the directory that contains the class files:


java Capri [inputfile] [outputfile]


The Capri tool mines structural line patterns, in a type-casted format, from single-line and multi-line semi-structured log data files. By default it generates a text output file, "LTVPatterns.txt", which lists the line patterns, the frequent term and value patterns, and the association rules that show the contextual relationships between pairs of line patterns.

The tool prompts for the input and output file names, the support thresholds for frequent line, term and value patterns, and the support and confidence thresholds for the rules. Only the input file name is mandatory; all other entries are optional. If an output file name is provided, the tool writes a copy of the input file in which each line is labeled with a line number and a pattern id, to facilitate the next phase of data processing. For large input files, the output file name can be omitted to avoid regenerating the whole data set.

The tool extracts frequent, rare and interesting line patterns, where an interesting line pattern is one that contains 3 or more consecutive symbols in a word. For mining frequent line, term and value patterns, it requests a minimum support threshold for each of these and uses a default value for any threshold the user does not enter.

Typically a rule is generated for every possible pair of line patterns that appear in the data file. To reduce the rule set, the user can enter minimum support and confidence thresholds for the rules. A default value is used for any threshold the user does not enter.
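
The "interesting" criterion (3 or more consecutive symbols in a word) can be illustrated with a small Java sketch. This is not Capri's own code. The class and method names are hypothetical, and the sketch assumes that a "symbol" is any character other than an ASCII letter, an ASCII digit or whitespace, and that words are separated by whitespace; Capri may define both differently.

```java
import java.util.regex.Pattern;

public class InterestingWordSketch {

    // Assumption: a "symbol" is any character that is not an ASCII letter,
    // an ASCII digit or whitespace. The pattern matches a run of 3 or more.
    private static final Pattern SYMBOL_RUN =
            Pattern.compile("[^\\p{Alnum}\\s]{3,}");

    // True if the word contains 3 or more consecutive symbols.
    public static boolean hasSymbolRun(String word) {
        return SYMBOL_RUN.matcher(word).find();
    }

    // True if any whitespace-separated word of the line has such a run.
    public static boolean isInterestingLine(String line) {
        for (String word : line.trim().split("\\s+")) {
            if (hasSymbolRun(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isInterestingLine("==== session start ===="));
        System.out.println(isInterestingLine("user=alice id=42"));
    }
}
```

Under these assumptions the first line counts as interesting because the word "====" is a run of four symbols, while the second does not because no word contains more than one symbol in a row.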