PHPDiff’s documentation¶
Contents:
Two-way diff¶
Two-way diffs are the simplest diffs there are. It simply produces a change set between two documents. You can create
two-way diffs by using the Differ
class.
Examples¶
The example below shows a how to create a simple diff between two documents:
use CHItA\PHPDiff\DifferBase;
use CHItA\PHPDiff\Differ;
$differ = new Differ();
$diff = $differ->diff(
array('a', 'b', 'b', 'a', 'c', 'b', 'b', 'a'),
array('a', 'b', 'a', 'c', 'c', 'c', 'c', 'a', 'd')
);
// $diff will contain an array with the following structure:
//
// array(
// array(
// 'type' => DifferBase::UNCHANGED,
// array('a', 'b')
// ),
// array(
// 'type' => DifferBase::REMOVED,
// array('b')
// ),
// array(
// 'type' => DifferBase::UNCHANGED,
// array('a', 'c')
// ),
// array(
// array(
// 'type' => DifferBase::REMOVED,
// array('b', 'b')
// ),
// array(
// 'type' => DifferBase::ADDED,
// array('c', 'c', 'c')
// )
// ),
// array(
// 'type' => DifferBase::UNCHANGED,
// array('a')
// ),
// array(
// 'type' => DifferBase::ADDED,
// array('d')
// )
// )
Note
For more information about the diff structure generated by the library, please read the Diff structure section.
Three-way diffs¶
Three-way diffs are the set of changes between a parent and two child documents. There are multiple algorithms for computing three-way diffs which are producing different results. PHPDiff implements the Weave merge and Three-way merge algorithms out of the box, while providing an interface for creating other implementations as well.
Weave merge is the default three-way merge implementation provided by this library, however, it can be easily changed
either by providing an other three-way diff algorithm implementation in Differ3
‘s constructor, or by calling the
setter function on that class.
Description of the algorithms can be found on Wikipedia <https://en.wikipedia.org/wiki/Merge_(version_control)>.
Weave merge¶
Weave merge is a simple algorithm, which produces a diff that contains all units that are present in both modified versions of the files, and none that were deleted from either of them.
This algorithm only produces a merge conflict when the order of the units in the merged documents cannot be determined.
Three-way merge¶
Three-way merge is maybe the most common three-way merge algorithm out there. It basically produces an output which contains all the changes that are present in either of the modified versions of the document, however, when both documents containing changes at the same place, it produces a conflict.
Examples¶
Simple three-way diff example, using the three-way merge algorithm:
use CHItA\PHPDiff\Diff3Algorithm\ThreeWayMerge;
use CHItA\PHPDiff\Differ;
use CHItA\PHPDiff\Differ3;
use CHItA\PHPDiff\DifferBase;
use CHItA\PHPDiff\LongestCommonSubsequence\Algorithm\Hirschberg;
$differ = new Differ3(
new ThreeWayMerge(new Differ(null, new Hirschberg()))
);
$diff = $differ->diff(
array('a', 'b', 'c', 'd', 'e', 'f'), // Original document
array('a', 'b', 'w', 'x', 'y', 'e'), // Modified 1
array('a', 'q', 'r', 's', 'b', 'f') // Modified two
);
// $diff will contain an array with the following structure:
//
// array(
// array(
// 'type' => DifferBase::UNCHANGED,
// array('a')
// ),
// array(
// 'type' => DifferBase::ADDED,
// array('q', 'r', 's')
// ),
// array(
// 'type' => DifferBase::UNCHANGED,
// array('b')
// ),
// array(
// array(
// 'type' => DifferBase::REMOVED,
// array('c', 'd', 'e', 'f')
// ),
// array(
// 'type' => DifferBase::ADDED,
// array('w', 'x', 'y')
// )
// )
// )
Note
For more information about the diff structure generated by the library, please read the Diff structure section.
Diff structure¶
The library returns diffs in an array. These arrays contain the changes, which are grouped by the type of the changes in the following format:
array(
$change1,
$change2,
)
Note
Where $changeN
can be any of the change blocks listed below.
Diff change blocks¶
Unchanged block is used when all of the two (or three) documents have the same units in common.
array(
'type' => DifferBase::UNCHANGED,
array('all', 'common', 'units', 'until', 'the', 'next', 'change type')
)
Added block is used when the original version did not contain some units that are present in the modified version(s).
array(
'type' => DifferBase::ADDED,
array('all', 'common', 'units', 'until', 'the', 'next', 'change type')
)
Removed block is used when the original version did contain some units that are not present in the modified version(s).
array(
'type' => DifferBase::REMOVED,
array('all', 'common', 'units', 'until', 'the', 'next', 'change type')
)
Edit block is used when the modified version(s) contain both additions and deletions compared to the original version.
array(
array(
'type' => DifferBase::REMOVED,
array('removed', 'units')
),
array(
'type' => DifferBase::ADDED,
array('added', 'units')
)
)
Conflict block is used when a merge conflict cannot be resolved. This could only occur in three-way diffs. Note that because these units are neither present nor removed from the document, in general no removed lines are returned before this block (even where some units are removed from both modified versions).
array(
'type' => DifferBase::CONFLICT,
array('conflicting', 'units', 'in', 'version1'),
array('conflicting', 'units', 'in', 'version2'),
)
Longest Common Subsequence¶
To generate diffs we solve the longest common subsequence problem for the documents to determine which lines are the ones that did not changed.
This library provides two implementations out of the box for the longest common subsequence problem, one that is time efficient (dynamic programming approach) and another that is memory efficient (Hirschberg’s algorithm).
There is also an option to implement a strategy that selects the solver based on the inputs. For this, you need to
implement the CHItA\PHPDiff\LongestCommonSubsequence\Strategy\StrategyInterface
.
Custom comparison¶
By implementing CHItA\PHPDiff\Comparison\ComparisonInterface
, you may add a custom data comparison implementation
to the algorithm. This could be useful if you would like to compare the trimmed versions of the units for example.
Warning
Comparison algorithm never alters the content of the passed elements. Any manipulation of the input data in the comparison algorithm will not be present in the output.
However, please be aware that when you use this option, the returned units from two-way diffs will contain the units from the modified version (second parameter of the Differ::diff() method). Please also note, that when using custom comparisons with three-way diffs, the units in the output could be from either of the modified documents.
Example¶
An example comparison algorithm that trims whitespaces from the end of the units before comparing them.
use CHItA\PHPDiff\Comparison\ComparisonInterface;
class TrimCompare implements ComparisonInterface
{
public function compare($value1, $value2)
{
return rtrim($value1) === rtrim($value2);
}
}
use CHItA\PHPDiff\DifferBase;
use CHItA\PHPDiff\Differ;
$differ = new Differ();
$differ->setComparisonAlgorithm(new TrimCompare());
$diff = $differ->diff(
array('a', 'b', 'b', 'c'),
array(' a', 'b ', 'b ', 'c')
);
// $diff will contain an array with the following structure:
//
// array(
// array(
// array(
// 'type' => DifferBase::REMOVED,
// array('a')
// ),
// array(
// 'type' => DifferBase::ADDED,
// array(' a')
// )
// ),
// array(
// 'type' => DifferBase::UNCHANGED,
// array('b ', 'b ', 'c')
// )
// )
Custom Sequencing strategy¶
By default, the diff algorithms expect arrays as the input, in which case it is assumed, that each element of the array is a unit (an element of the sequence and a single letter of the alphabet which the sequence is generated from).
However, you can implement CHItA\PHPDiff\SequencingStrategy\SequencingStrategyInterface
and handle any type of input
yourself.
Example¶
In the example below, the sequence is generated from a string where the units are the bytes of the string (ascii characters).
use CHItA\PHPDiff\SequencingStrategy\SequencingStrategyInterface;
class MySequencer implements SequencingStrategyInterface
{
public function getSequence($dataSet)
{
return str_split($dataSet);
}
}