Saturday Nov 11, 2006
RegExp classes
Character Classes
Character classes are groups of characters to test for. By enclosing characters inside square brackets, you tell the regular expression to match any one of them: the first character, the second character, the third character, and so on. For example, to match a, b, or c, the character class is [abc]. This is called a simple class because it specifies the exact characters to look for.
Simple Classes
Suppose you want to match "bat", "cat", and "fat". A simple character class handles this easily:
var strCheck = "a bat, a Cat, a fAt baT, a faT cat";
var reBatCatRat = /[bcf]at/gi;
var arrMatches = strCheck.match(reBatCatRat);
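As a quick check of the example above, the following sketch runs the same pattern and logs the result: six matches, because the g option returns every match and the i option ignores case. The second pattern, reCaseSensitive, is an illustrative addition (not part of the original example) that drops the i option to show what changes.

```javascript
var strCheck = "a bat, a Cat, a fAt baT, a faT cat";

// Same pattern as above: b, c, or f followed by "at", global and case-insensitive.
var reBatCatRat = /[bcf]at/gi;
var arrMatches = strCheck.match(reBatCatRat);
console.log(arrMatches); // ["bat", "Cat", "fAt", "baT", "faT", "cat"]

// Hypothetical variant without the i option: only all-lowercase matches remain.
var reCaseSensitive = /[bcf]at/g;
var arrLowerOnly = strCheck.match(reCaseSensitive);
console.log(arrLowerOnly); // ["bat", "cat"]
```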
Negation Classes
At times you may want to match all characters except for a select few. In this case, you can use a negation class, which specifies characters to exclude. For example, to match all characters except a and b, the character class is [^ab]. The caret (^), placed immediately after the opening bracket, tells the regular expression that the character must not match any of the characters that follow.
Going back to the previous example, what if you only wanted to get words containing at but not beginning with b or c?
var strCheck = "a bat, a Cat, a fAt baT, a faT cat";
var reBatCatRat = /[^bc]at/gi;
var arrMatches = strCheck.match(reBatCatRat);
In this case, arrMatches contains "fAt" and "faT", because these strings match the pattern of a sequence ending with at but not beginning with b or c.
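One thing worth keeping in mind is that a negation class accepts any character outside the excluded set, not just letters. The sketch below uses a hypothetical sample string, strPhrase, to show a space being matched, and then one possible way to restrict the first character to letters. The [ad-z] class is an assumption made for illustration, not part of the original example.

```javascript
// [^bc] matches ANY character other than b or c, including spaces and punctuation.
var strPhrase = "look at that"; // hypothetical sample string
var arrLoose = strPhrase.match(/[^bc]at/g);
console.log(arrLoose); // [" at", "hat"]

// Restricting the first character to letters other than b and c avoids that.
var arrLetters = strPhrase.match(/[ad-z]at/gi);
console.log(arrLetters); // ["hat"]
```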
Range Classes
Up until this point, the character classes required you to type all the characters to include or exclude. Suppose that you want to match any letter of the alphabet, but you really don't want to type every letter. Instead, you can use a range class to specify a range between a and z: [a-z]. The key here is the dash (-), which should be read as through instead of minus (so the class is read as a through z, not a minus z).
Important: Note that [a-z] matches only lowercase letters unless the regular expression is set to case insensitive by using the i option. To match only uppercase letters, you must use [A-Z].
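The case behavior of range classes can be sketched as follows. The sample string strSample is hypothetical and chosen only to mix uppercase and lowercase letters.

```javascript
var strSample = "aBcD"; // hypothetical sample string

// Lowercase range only.
var arrLower = strSample.match(/[a-z]/g);
console.log(arrLower); // ["a", "c"]

// Uppercase range only.
var arrUpper = strSample.match(/[A-Z]/g);
console.log(arrUpper); // ["B", "D"]

// Lowercase range with the i option matches both cases.
var arrAnyCase = strSample.match(/[a-z]/gi);
console.log(arrAnyCase); // ["a", "B", "c", "D"]
```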
Range classes work whenever the characters you want to test are in order by character code. Consider the following example:
var strCheck = "num1, num2, num3, num4, num5, num6, num7, num8, num9";
var reOneToFour = /num[1-4]/gi;
var arrMatches = strCheck.match(reOneToFour);
After execution, arrMatches contains four items: "num1", "num2", "num3", and "num4" because they all match num and are followed by a character in the range 1 through 4.
Important: You can also negate range classes so as to exclude all characters within a given range. For example, to exclude characters 1 through 4, the class is [^1-4].
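Applied to the same test string, a negated range might look like the sketch below. The names reNotOneToFour and arrRest are introduced here for illustration only.

```javascript
var strCheck = "num1, num2, num3, num4, num5, num6, num7, num8, num9";

// "num" followed by any character that is NOT in the range 1 through 4.
var reNotOneToFour = /num[^1-4]/gi;
var arrRest = strCheck.match(reNotOneToFour);
console.log(arrRest); // ["num5", "num6", "num7", "num8", "num9"]
```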
Combination Classes
A combination class is a character class that is made up of several other character classes. For instance, suppose you want to match all letters a through m, numbers 1 through 4, and the newline character. The class looks like this:
[a-m1-4\n]
Note that there are no spaces between the different internal classes.
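To see which characters the combination class accepts, here is a small sketch. The ^ and $ anchors and the name reCombo are additions for this illustration, so that each call tests exactly one character.

```javascript
// Anchors are added here so each test checks exactly one character.
var reCombo = /^[a-m1-4\n]$/;

console.log(reCombo.test("c"));  // true  (within a through m)
console.log(reCombo.test("n"));  // false (just past m)
console.log(reCombo.test("3"));  // true  (within 1 through 4)
console.log(reCombo.test("5"));  // false (outside 1 through 4)
console.log(reCombo.test("\n")); // true  (the newline character)
console.log(reCombo.test(" "));  // false (a space is not part of the class)
```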
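Important: JavaScript/ECMAScript regular expressions have traditionally not supported union and intersection classes as some other regular expression implementations do. This means that patterns such as [a-m[p-z]] or [a-m[^b-e]] are not treated as nested classes. (Newer engines that implement the ES2024 v flag add set notation, but only when that flag is used.)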
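The sketch below shows what happens when a nested-looking class is used in an ordinary pattern (one without the newer v flag), along with two possible workarounds. The workarounds are assumptions for illustration: a single combination class for the union, and a negative lookahead for one plausible reading of "a through m, except b through e".

```javascript
// In an ordinary pattern, the inner "[" is just another member of the class
// and the final "]" is a literal character that must follow it.
var reNotNested = /^[a-m[p-z]]$/;
console.log(reNotNested.test("q"));  // false
console.log(reNotNested.test("q]")); // true

// Union: list both ranges in a single combination class.
var reUnion = /^[a-mp-z]$/;
console.log(reUnion.test("q")); // true
console.log(reUnion.test("n")); // false

// Subtraction (a through m, except b through e): a negative lookahead
// rejects the excluded range before the class consumes the character.
var reSubtract = /^(?![b-e])[a-m]$/;
console.log(reSubtract.test("a")); // true
console.log(reSubtract.test("c")); // false
console.log(reSubtract.test("f")); // true
```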