class SequenceMatcher

A Diff Sequence Matcher

Methods

__construct(string|array $a, string|array $b, string|array $junkCallback = null, $options)

The constructor. Sets the two sequences passed in for the sequence matcher, performs a basic cleanup and calculates the junk elements.

void
setOptions(array $options)

Set options for the matcher.

void
setSequences(string|array $a, string|array $b)

Set the first and second sequences to use with the sequence matcher.

void
setSeq1(string|array $a)

Set the first sequence ($a) and reset any internal caches so that the calculation methods recompute their results the next time they are called.

void
setSeq2(string|array $b)

Set the second sequence ($b) and reset any internal caches so that the calculation methods recompute their results the next time they are called.

array
findLongestMatch(int $alo, int $ahi, int $blo, int $bhi)

Find the longest matching block in the two sequences, as defined by the lower and upper constraints for each sequence ($alo to $ahi for the first sequence and $blo to $bhi for the second sequence).

bool
linesAreDifferent(int $aIndex, int $bIndex)

Check if the two lines at the given indexes are different or not.

array
getMatchingBlocks()

Return a nested set of arrays for all of the matching sub-sequences in the strings $a and $b.

array
getOpCodes()

Return a list of all of the opcodes for the differences between the two strings.

array
getGroupedOpcodes(int $context = 3)

Return a series of nested arrays containing different groups of generated opcodes for the differences between the strings with up to $context lines of surrounding content.

float
ratio()

Return a measure of the similarity between the two sequences.

Details

__construct(string|array $a, string|array $b, string|array $junkCallback = null, $options)

The constructor. Sets the two sequences passed in for the sequence matcher, performs a basic cleanup and calculates the junk elements.

Parameters

string|array $a

A string or array containing the lines to compare against.

string|array $b

A string or array containing the lines to compare.

string|array $junkCallback

Either an array or string that references a callback function (if there is one) to determine 'junk' characters.

$options

void setOptions(array $options)

Set options for the matcher.

Parameters

array $options

Return Value

void

void setSequences(string|array $a, string|array $b)

Set the first and second sequences to use with the sequence matcher.

Parameters

string|array $a

A string or array containing the lines to compare against.

string|array $b

A string or array containing the lines to compare.

Return Value

void

void setSeq1(string|array $a)

Set the first sequence ($a) and reset any internal caches so that the calculation methods recompute their results the next time they are called.

Parameters

string|array $a

The sequence to set as the first sequence.

Return Value

void

void setSeq2(string|array $b)

Set the second sequence ($b) and reset any internal caches so that the calculation methods recompute their results the next time they are called.

Parameters

string|array $b

The sequence to set as the second sequence.

Return Value

void

array findLongestMatch(int $alo, int $ahi, int $blo, int $bhi)

Find the longest matching block in the two sequences, as defined by the lower and upper constraints for each sequence ($alo to $ahi for the first sequence and $blo to $bhi for the second sequence).

Essentially, of all of the maximal matching blocks, return the one that starts earliest in $a, and of all of those maximal matching blocks that start earliest in $a, return the one that starts earliest in $b.

If the junk callback is defined, do the above but with the additional restriction that no junk element appears in the block. The block is then extended as far as possible by matching only junk elements in both $a and $b.

Parameters

int $alo

The lower constraint for the first sequence.

int $ahi

The upper constraint for the first sequence.

int $blo

The lower constraint for the second sequence.

int $bhi

The upper constraint for the second sequence.

Return Value

array

Array describing the longest match: the starting position in $a, the starting position in $b and the length of the match.
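The tie-breaking rule described above can be sketched in a few lines. This is a simplified illustration and not the library's implementation: longestMatchSketch is a hypothetical name, junk handling is left out entirely, and the upper constraints are assumed to be exclusive.

```php
<?php
/**
 * Illustrative sketch only: find the longest common block of
 * $a[$alo..$ahi) and $b[$blo..$bhi), with no junk handling.
 * Treating the upper constraints as exclusive is an assumption.
 *
 * @return array [start in $a, start in $b, length]
 */
function longestMatchSketch(array $a, array $b, int $alo, int $ahi, int $blo, int $bhi): array
{
    $best = [$alo, $blo, 0];
    // $prev[$j] holds the length of the match ending at $a[$i - 1] and $b[$j].
    $prev = [];
    for ($i = $alo; $i < $ahi; $i++) {
        $curr = [];
        for ($j = $blo; $j < $bhi; $j++) {
            if ($a[$i] !== $b[$j]) {
                continue;
            }
            $len = ($prev[$j - 1] ?? 0) + 1;
            $curr[$j] = $len;
            // Strict ">" keeps the earliest start in $a, then in $b, on ties.
            if ($len > $best[2]) {
                $best = [$i - $len + 1, $j - $len + 1, $len];
            }
        }
        $prev = $curr;
    }
    return $best;
}

// "bc" is the longest block shared by "abcd" and "xbcy".
print_r(longestMatchSketch(['a', 'b', 'c', 'd'], ['x', 'b', 'c', 'y'], 0, 4, 0, 4));
```

For this input the sketch returns [1, 1, 2]: the block starts at index 1 in both sequences and is two elements long.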

bool linesAreDifferent(int $aIndex, int $bIndex)

Check if the two lines at the given indexes are different or not.

Parameters

int $aIndex

Line number to check against in $a.

int $bIndex

Line number to check against in $b.

Return Value

bool

True if the lines are different and false if not.

array getMatchingBlocks()

Return a nested set of arrays for all of the matching sub-sequences in the strings $a and $b.

Each block contains the lower constraint of the block in $a, the lower constraint of the block in $b and finally the number of lines that the block continues for.

Return Value

array

Nested array of the matching blocks, as described by the function.

array getOpCodes()

Return a list of all of the opcodes for the differences between the two strings.

Each opcode in the returned nested array is itself an array with the following elements:
0 - The type of tag (as described below) for the opcode.
1 - The beginning line in the first sequence ($i1).
2 - The end line in the first sequence ($i2).
3 - The beginning line in the second sequence ($j1).
4 - The end line in the second sequence ($j2).

The different types of tags are:
replace - The string from $i1 to $i2 in $a should be replaced by the string in $b from $j1 to $j2.
delete - The string in $a from $i1 to $i2 should be deleted.
insert - The string in $b from $j1 to $j2 should be inserted at $i1 in $a.
equal - The two strings with the specified ranges are equal.

Return Value

array

Array of the opcodes describing the differences between the strings.
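As a rough illustration of how these opcodes can be consumed, the sketch below rebuilds $b from $a by walking a list of opcodes. The function name applyOpcodesSketch is hypothetical, the opcodes are written by hand rather than captured from getOpCodes, and the end indexes are assumed to be exclusive.

```php
<?php
/**
 * Illustrative sketch: rebuild $b from $a by walking opcodes of the
 * form [tag, i1, i2, j1, j2]. End indexes are assumed to be exclusive.
 */
function applyOpcodesSketch(array $a, array $b, array $opcodes): array
{
    $result = [];
    foreach ($opcodes as [$tag, $i1, $i2, $j1, $j2]) {
        switch ($tag) {
            case 'equal':
                // Unchanged elements are copied from $a.
                $result = array_merge($result, array_slice($a, $i1, $i2 - $i1));
                break;
            case 'replace':
            case 'insert':
                // New elements come from $b.
                $result = array_merge($result, array_slice($b, $j1, $j2 - $j1));
                break;
            case 'delete':
                // Deleted elements of $a are skipped.
                break;
        }
    }
    return $result;
}

// Hand-written opcodes for this pair, not output captured from the library.
$a = ['one', 'two', 'three', 'four'];
$b = ['one', 'TWO', 'three', 'four', 'five'];
$opcodes = [
    ['equal',   0, 1, 0, 1],
    ['replace', 1, 2, 1, 2],
    ['equal',   2, 4, 2, 4],
    ['insert',  4, 4, 4, 5],
];
var_dump(applyOpcodesSketch($a, $b, $opcodes) === $b);
```

If the exclusive-end assumption holds for the real opcodes, the same loop should work on the array returned by getOpCodes.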

array getGroupedOpcodes(int $context = 3)

Return a series of nested arrays containing different groups of generated opcodes for the differences between the strings with up to $context lines of surrounding content.

Essentially, any large blocks of equal strings are stripped out and the remaining smaller sets of changes are arranged into groups. This means that the sequence matcher and diffs do not need to include the full content of the files but can still provide context showing where the changes are.

Parameters

int $context

The number of lines of context to provide around the groups.

Return Value

array

Nested array of all of the grouped opcodes.
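The grouping step can be sketched as follows. This is modelled on Python's difflib.get_grouped_opcodes and is not taken from this class's source, so whether the library groups in exactly this way is an assumption; groupOpcodesSketch is a hypothetical name and end indexes are assumed to be exclusive.

```php
<?php
/**
 * Illustrative sketch: split opcodes of the form [tag, i1, i2, j1, j2]
 * into groups, keeping at most $context elements of "equal" content
 * around each change. End indexes are assumed to be exclusive.
 */
function groupOpcodesSketch(array $opcodes, int $context = 3): array
{
    if ($opcodes === []) {
        return [];
    }
    $last = count($opcodes) - 1;
    // Trim a leading equal block down to its last $context elements.
    if ($opcodes[0][0] === 'equal') {
        [$tag, $i1, $i2, $j1, $j2] = $opcodes[0];
        $opcodes[0] = [$tag, max($i1, $i2 - $context), $i2, max($j1, $j2 - $context), $j2];
    }
    // Trim a trailing equal block down to its first $context elements.
    if ($opcodes[$last][0] === 'equal') {
        [$tag, $i1, $i2, $j1, $j2] = $opcodes[$last];
        $opcodes[$last] = [$tag, $i1, min($i2, $i1 + $context), $j1, min($j2, $j1 + $context)];
    }
    $groups = [];
    $group = [];
    foreach ($opcodes as [$tag, $i1, $i2, $j1, $j2]) {
        // A long equal block closes the current group and opens the next.
        if ($tag === 'equal' && $i2 - $i1 > 2 * $context) {
            $group[] = [$tag, $i1, min($i2, $i1 + $context), $j1, min($j2, $j1 + $context)];
            $groups[] = $group;
            $group = [];
            $i1 = max($i1, $i2 - $context);
            $j1 = max($j1, $j2 - $context);
        }
        $group[] = [$tag, $i1, $i2, $j1, $j2];
    }
    // Drop a final group that holds nothing but a single equal block.
    if ($group !== [] && !(count($group) === 1 && $group[0][0] === 'equal')) {
        $groups[] = $group;
    }
    return $groups;
}
```

With two changes separated by a long run of equal lines, the sketch produces two groups, each carrying up to $context equal lines on either side of its change.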

float ratio()

Return a measure of the similarity between the two sequences.

This will be a float value between 0 and 1.

Of the ratio calculation functions, this is the most expensive to call if getMatchingBlocks or getOpCodes has not yet been called. The other calculation methods (quickRatio and realquickRatio) are quicker to run but may be less accurate.

The ratio is calculated as (2 * number of matches) / total number of elements in both sequences.

Return Value

float

The calculated ratio.
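The formula above can be worked through with a small sketch. It assumes the matching blocks have the [start in $a, start in $b, length] shape described for getMatchingBlocks; ratioSketch is a hypothetical helper, and returning 1.0 for two empty sequences is an assumption rather than documented behaviour.

```php
<?php
/**
 * Illustrative sketch of the ratio formula: sum the lengths of the
 * matching blocks and scale by the combined size of both sequences.
 */
function ratioSketch(array $matchingBlocks, int $lengthA, int $lengthB): float
{
    $total = $lengthA + $lengthB;
    if ($total === 0) {
        // Treating two empty sequences as identical is an assumption.
        return 1.0;
    }
    // Element 2 of each block is its length.
    $matches = array_sum(array_column($matchingBlocks, 2));
    return 2.0 * $matches / $total;
}

// Hand-written block for "abcd" vs "bcde": "bcd" matches, length 3.
echo ratioSketch([[1, 0, 3]], 4, 4); // 0.75
```

Here three matching elements out of eight in total give (2 * 3) / 8 = 0.75.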