In my last practicum-related blog post, I adumbrated some of the preparatory procedures involved when setting out to develop a software application. You can read this post here. In my case, the application is a text differentiation tool (which highlights the differences between two text strings) for the new, revised version of the Versioning Machine, due to be rolled out sometime later this summer.
<p id="string1">The quick brown fox jumps over the lazy dog</p>
<p id="string2">The quick brown fox jumped over the lazy dog</p>
The two strings above, enclosed in <p> tags and each with its own unique id attribute, serve as proxies for the VM’s HTML sample files, which I hope to work with at a later date, provided my prototype is in working order. To work with them in JS, I first grab each one by its id:
// .textContent returns the text inside each <p> element as a plain string
var String1 = document.getElementById("string1").textContent;
var String2 = document.getElementById("string2").textContent;
One thing I need to keep in mind, however, is that the VM code may not lend itself to being accessed through the getElementById() method, and so I may need to figure out another way of grabbing the appropriate elements if I were to begin implementing my JS code within the VM. All I am doing here is creating a working HTML and JS environment in order to experiment with different JS methods and to see if they may be of use for the application I intend to build.
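If getElementById() turns out not to suit the VM's markup, one alternative worth experimenting with is a CSS selector passed to querySelectorAll(). The sketch below is only an illustration: the helper name getTextsBySelector and the selector "p.witness" are made-up placeholders, not something taken from the VM's actual code.

```javascript
// Hypothetical helper: collect the text of every element matching a CSS selector.
// `root` is anything that has querySelectorAll(), e.g. `document` in a browser.
function getTextsBySelector(root, selector) {
  var nodes = root.querySelectorAll(selector);
  // A NodeList is array-like, so borrow Array's map to read each element's text.
  return Array.prototype.map.call(nodes, function (node) {
    return node.textContent.trim();
  });
}

// In a browser this might be called as follows (the selector is a placeholder):
// var texts = getTextsBySelector(document, "p.witness");
```

Passing the root in as a parameter, rather than using document directly, makes it easier to try the helper on a small part of a page first.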
Once I get the correct HTML elements and have stored them in variables, I will need to figure out how to “split” the strings by each word, because my text comparison tool has to compare two strings word by word. As yet, the two HTML strings stored inside var String1 and var String2 are each held as a single series of characters, not word by word. At the moment the computer sees each string as one unit, made up of characters (such as letters) and whitespace (which is also a character). To change the way the computer reads the strings, that is, word for word, closer to how a human parses a sentence, we need to break, or split, each string up.
As I mentioned, whitespace is in and of itself a character, just like a letter. Therefore, if we split each string at the whitespace character, we capture each word that sits between two pieces of whitespace. Don’t forget, though, that these words are still strings (series of characters); we have just found a way for the computer to recognise each of these groups of characters as a distinct entity. Splitting a string by whitespace is very easy. All you have to do is use the string split() method, followed by (" "). The " " between the parentheses, a single space inside quotation marks, is where we ask the computer to split at each space. The space matters: split("") with nothing between the quotation marks would break the string into individual characters instead. If we had split("b"), the string would be broken each time the computer comes across the letter “b”, and the “b” itself would be dropped.
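A quick way to see what split() does with different separators is to try it on a short sample. The variable names below are just for illustration.

```javascript
var sentence = "The quick brown fox";

// Splitting on a single space gives one entry per word.
var words = sentence.split(" ");
console.log(words); // ["The", "quick", "brown", "fox"]

// Splitting on an empty string gives one entry per character, spaces included.
var chars = sentence.split("");
console.log(chars.length); // 19

// Splitting on "b" breaks the string wherever a "b" occurs and drops the "b" itself.
var parts = sentence.split("b");
console.log(parts); // ["The quick ", "rown fox"]
```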
If then I were to split the variable String1 up by whitespace:
var String1 = document.getElementById("string1").textContent;
var myArray1 = String1.split(" ");
The result would be that it is broken up, like this:
The, quick, brown, fox, jumps, over, the, lazy, dog
If I were to do String1.split("b"), it would then come out like this:
The quick ,rown fox jumps over the lazy dog
Employing the split() method on both var String1 and var String2, then, gives us two arrays of substrings, so to speak, made up of blocks of characters that are essentially words. As in the String1.split(" ") example above, the array of substrings from String1 is stored in the variable myArray1. We would follow exactly the same procedure for String2, and we could call that array myArray2.
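Text pulled out of an HTML page often carries a stray space or line break at either end, and split(" ") would turn those into empty entries in the array. A slightly more forgiving version, sketched here with a hypothetical helper called splitIntoWords, trims the string first and splits on any run of whitespace.

```javascript
// Hypothetical helper: turn a sentence into an array of words.
function splitIntoWords(text) {
  var trimmed = text.trim();
  // An empty string would otherwise split into [""], so return [] instead.
  if (trimmed === "") {
    return [];
  }
  // /\s+/ matches one or more whitespace characters in a row.
  return trimmed.split(/\s+/);
}

// The stray spaces at the ends are deliberate, to show that they are ignored.
var myArray1 = splitIntoWords("The quick brown fox jumps over the lazy dog ");
var myArray2 = splitIntoWords(" The quick brown fox jumped over the lazy dog");
console.log(myArray1.length, myArray2.length); // 9 9
```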
Arrays in JS are essentially a special type of object that can store multiple values in a single variable. Each of these values has an index number. The index numbers work the same way for every array: they begin at 0 and run up to one less than the array’s length. The indexes for myArray1, then, are:
Index [0] = The
Index [1] = quick
Index [2] = brown
Index [3] = fox
Index [4] = jumps
Index [5] = over
Index [6] = the
Index [7] = lazy
Index [8] = dog
For our other array, myArray2, which now stores the words from String2 following the use of the split() method, the word at each index is identical, except at Index [4], which holds ‘jumped’ instead of ‘jumps’.
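To check the indexes for yourself, a short loop can print each position alongside the word that both arrays hold there. This is just a sketch using the two sample sentences typed in directly.

```javascript
var myArray1 = "The quick brown fox jumps over the lazy dog".split(" ");
var myArray2 = "The quick brown fox jumped over the lazy dog".split(" ");

// Print every index alongside the word each array holds at that position.
for (var i = 0; i < myArray1.length; i++) {
  console.log("Index [" + i + "] = " + myArray1[i] + " / " + myArray2[i]);
}

// Square brackets read a single value out of an array by its index.
console.log(myArray1[4]); // "jumps"
console.log(myArray2[4]); // "jumped"
```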
The next phase, then, will require comparing these two arrays, myArray1 and myArray2, getting the computer to determine which index is not the same (Index [4]) and then highlighting this difference. The highlighting can be done very easily with the CSS background-color property. So, whenever the computer comes across a difference, it will need to apply this background-color highlight to the word at the index in question.
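One possible shape for this step, offered as a rough sketch rather than the finished tool, is to walk both arrays index by index and wrap any word that differs in a <span> carrying the background-color style. The helper names highlightDifferences and escapeHtml, and the choice of yellow, are my own placeholders.

```javascript
// Hypothetical helper: escape characters that have a special meaning in HTML.
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical helper: rebuild the second sentence as HTML, highlighting every
// word that differs from the word at the same index in the first sentence.
function highlightDifferences(baseWords, otherWords, color) {
  var out = [];
  for (var i = 0; i < otherWords.length; i++) {
    var word = escapeHtml(otherWords[i]);
    if (otherWords[i] !== baseWords[i]) {
      out.push('<span style="background-color: ' + color + '">' + word + "</span>");
    } else {
      out.push(word);
    }
  }
  return out.join(" ");
}

var myArray1 = "The quick brown fox jumps over the lazy dog".split(" ");
var myArray2 = "The quick brown fox jumped over the lazy dog".split(" ");
var html = highlightDifferences(myArray1, myArray2, "yellow");
console.log(html);

// In a browser the result could then be written back into the page, e.g.:
// document.getElementById("string2").innerHTML = html;
```

Building the HTML as a string keeps the comparison logic separate from the page, so it can be tried out in the console before being wired up to real elements.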
At the moment, I need to do more research into how to compare two arrays and then, once a difference is picked up, how to take this ‘difference’ out of the array so that it can be highlighted. I intend to keep you all posted on my progress.
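One thing that may come up in that research: comparing index by index only works while both strings have the same number of words. If a word is added or removed, every later index shifts and everything after it looks different. A common technique for this situation is the longest common subsequence (LCS). The sketch below is one possible approach, not necessarily what the finished tool will use; the function name diffWords and its output format are hypothetical.

```javascript
// Hypothetical helper: label each word as "same", "removed" (only in a)
// or "added" (only in b), using a longest-common-subsequence table.
function diffWords(a, b) {
  // lcs[i][j] = length of the longest common subsequence of a.slice(i) and b.slice(j).
  var lcs = [];
  for (var i = 0; i <= a.length; i++) {
    lcs.push([]);
    for (var j = 0; j <= b.length; j++) {
      lcs[i].push(0);
    }
  }
  // Fill the table from the ends of both arrays back to the start.
  for (i = a.length - 1; i >= 0; i--) {
    for (j = b.length - 1; j >= 0; j--) {
      if (a[i] === b[j]) {
        lcs[i][j] = lcs[i + 1][j + 1] + 1;
      } else {
        lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
      }
    }
  }
  // Walk both arrays together, using the table to decide which side to advance.
  var result = [];
  i = 0;
  j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      result.push({ type: "same", word: a[i] });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      result.push({ type: "removed", word: a[i] });
      i++;
    } else {
      result.push({ type: "added", word: b[j] });
      j++;
    }
  }
  // Whatever is left over on either side has no counterpart.
  while (i < a.length) {
    result.push({ type: "removed", word: a[i++] });
  }
  while (j < b.length) {
    result.push({ type: "added", word: b[j++] });
  }
  return result;
}

var diff = diffWords("The quick fox".split(" "), "The quick brown fox".split(" "));
console.log(diff.map(function (d) { return d.type + ":" + d.word; }).join(", "));
// same:The, same:quick, added:brown, same:fox
```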