What is the bell character, you might ask? In the old days of computing, before there was surround sound, you could "print" a character to the screen that made a "beep" sound. If you put a bell character (ASCII 7) in a string, it would beep to the screen. These days, the bell character is not used for much...if anything. As a ColdFusion programmer, you can make use of this not-often-used character for different situations. Some lists seem like they are unparseable using regular ColdFusion functions, and some strings might not even seem like lists. This article will show a couple ways to parse otherwise difficult lists.
Many ASCII characters are used in the course of an application -- line breaks are ASCII 10, carriage returns are ASCII 13, tab characters are ASCII 9, among others. For the types of uses that we will be talking about, the bell character is not likely to be used. We will be using it mostly as a delimiter. Delimiters are used in lists. Lists can be turned into arrays. Arrays can be worked with effectively in ColdFusion and other languages. The bell is a likely unprintable character to use for this, although you can use others as well.
A simple ColdFusion script might have a need to convert a list into an array. For example, lets use a database search script as an example. A search box on your site might allow for the user to enter multiple words, that you will later want to split into an array to be able to create your own custom SQL statement. The user types in:
database search coldfusion
It is unlikely that the phrase "database search coldfusion" will match any text in your database, however the individual words might. Obviously, the user wants to find all relevant search results that contain each one of these words. Your SQL statement should look like this:
SELECT somefield FROM sometable
WHERE
(somefield LIKE '%database%') AND
(somefield LIKE '%search%') AND
(somefield LIKE '%coldfusion%')
You say "that's easy, just split the string into an array using the space character as a delimiter". Ok, that will work. However, what if the user types in the following:
database,search,coldfusion
Now, the delimiter has become a comma. Even worse:
database, search, coldfusion
Now, you have a space and a comma as a delimiter. The easy solution, in my opinion, is to decide in advance what your possible delimiters will be, then change them all to the bell character. Having done that, you can split the string into an array at the bell character and then use the array to perform your tasks. You have to split the string at a group of characters, however, since there could be a space, two spaces, a comma, a comma and a space, or other characters you decide in advance. In the preceding example, if you had split the string at a comma AND a space, you would have had a problem with the space and comma together. Using a regular expression, we'll replace all possible delimiters with the bell:
<cfset mystring = ReReplace(mystring, "[\s,;]+", chr(7),"all")>
The regular expression tells us "any time you see one or more of any white space, comma, or semi-colon character, replace the group with a single bell character." Now, a simple ListToArray function will work to turn the string into an array:
<cfset myarray = ListToArray(mystring, chr(7))>
Now we can take this to the logical conclusion and write a SQL statement dynamically:
<cfset mywhereclause = " WHERE 1=1">
<cfloop from="1" to="#ArrayLen(myarray)#" index="i">
<cfset myWhereClause = "#myWhereClause# AND (somefield LIKE '%#myarray[i]#%')">
</cfloop>
<cfquery datasource="mydsn" name="rs">
SELECT myfield FROM mytable #PreserveSingleQuotes(myWhereClause)#
</cfquery>
Ok, this is somewhat useful. This could have been done other ways as well, such as running multiple replace statements converting all commas, spaces, and semi-colons into one specific character, but the bell solution is simple and elegant. The following string can not be split so easily:
/* Create mytable */
CREATE TABLE dbo.mytable (
id int IDENTITY (1, 1) NOT NULL ,
personid int NULL ,
description nvarchar (50) NULL
)
GO
/* Dumping data for table mytable */
SET IDENTITY_INSERT dbo.[mytable] ON
INSERT INTO [dbo].[mytable] (id, personid, description)
VALUES (1, 1, 'This is the first person, who is home')
GO
INSERT INTO [dbo].[mytable] (id, personid, description)
VALUES (2, 1, 'This is the first person''s second record')
GO
INSERT INTO [dbo].[mytable] (id, personid, description)
VALUES (3, 2, 'This is the second person who has gone fishing')
GO
This is a typical database dump script from SQL Server. Is this even a list? If it is, what is the delimiter here? The word "GO" looks like a likely candidate, but the word "GO" could easily be contained in the data ("gone fishing"). We'll use whitespace (a carriage return/line break with any number of space characters before or after) followed by GO followed by another whitespace. There could be any number of line breaks or spaces around the word GO, so a regular expression is necessary. For something like this, there is no ColdFusion function to turn an unknown multiple character string into a delimiter. There are scripts around the web (such as at www.cflib.org) that do a split based on multiple characters, but most of them rely on loops. Loops are fine, but usually more complicated and slower when you have longer strings. It can be done easily using the technique shown earlier. Assuming mystring holds the database script:
<cfset mystring = ReReplaceNoCase(mystring, "\s+GO\s+",chr(7),"all")>
This replaces the combination of one or more whitespace characters followed by GO followed by one or more whitespace characters with the bell character. Now, you can simply use the ListToArray function once again:
<cfset myarray = ListToArray(mystring, chr(7))>
To further extrapolate on this particular case, the SQL script can now be executed in a loop:
<cfloop from="1" to="#ArrayLen(myarray)#" index="i">
<cfset sql = myarray[i]>
<cfquery datasource="mydsn">
#PreserveSingleQuotes(sql)#
</cfquery>
</cfloop>
Using the knowledge from the above examples, we can write a simple Split function for CFML that uses a regular expression to split a string (list) into an array:
<cffunction name="Split">
<cfargument name="stringToSplit" type="string">
<cfargument name="regex" type="string">
<cfset stringToSplit = ReReplaceNoCase(
stringToSplit, "#regex#",chr(7),"all")>
<cfreturn ListToArray(stringToSplit, chr(7))>
</cffunction>
To call the function, pass in the string and the regular expression. The return value will be the array.
<cfset myArray = Split(myString, myRegex)>
Sometimes the obvious programming choices are not the easiest ways to do things. Looking outside the box is often useful when programming ColdFusion and other languages. Using the bell character is not an obvious ColdFusion technique, but one that can be very useful in many situations.