Match Any Character Using Regex In Java
In this short tutorial, we are going to shed light on how to match any character using regex in Java.
First, we will explain how to use a regular expression to match any single character. Then, we are going to showcase how to find multiple matches.
Finally, we will illustrate how to exclude and escape specific characters.
Regex to Match Any Character
Typically, we can use the dot/period pattern “.” to match a single character once.
In Java, the matched character can be any char except line terminators. However, we can address this limitation using the Pattern.DOTALL flag.
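The dot itself always matches exactly one character. What the whole expression matches depends on the quantifiers and anchors we combine it with.
For example, combining the dot with the end anchor, as in “.$”, matches the last character of a string, which makes it easy to remove that character.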
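As a minimal sketch of combining the dot with the end pattern, the snippet below removes the last character of a string. The class and method names are hypothetical and chosen only for illustration:

```java
class RemoveLastCharSketch {

    // Hypothetical helper: ".$" matches one character (.) located
    // right before the end of the input ($); we replace it with nothing.
    static String removeLastChar(String input) {
        return input.replaceAll(".$", "");
    }

    public static void main(String[] args) {
        System.out.println(removeLastChar("Hello!")); // prints Hello
        System.out.println(removeLastChar("A"));      // prints an empty line
    }
}
```

Since the dot does not match line terminators by default, a string that ends with a line break will not lose that line break with this pattern.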
Pattern Example | Description |
. | any single character except a line terminator |
.? | zero or one character, except a line terminator |
.+ | one or more characters, except line terminators |
.* | zero or more characters, except line terminators |
\. | matches the dot character itself |
A.B | a string starting with A, followed by any single char, and ending with B |
Basically, Java provides the Pattern class to denote a compiled regular expression.
So, let’s see how we can use it to compile a regex that matches any single character:
@Test
public void matchAnyCharacterUsingRegex() {
    assertTrue(Pattern.matches(".", "A"));
    // any char except new line
    assertFalse(Pattern.matches(".", "\n"));
    // using Pattern.DOTALL to match new line
    assertTrue(Pattern.compile(".", Pattern.DOTALL)
      .matcher("\n")
      .matches());
    assertTrue(Pattern.matches(".?", "C"));
    assertFalse(Pattern.matches(".?", "CD"));
    assertTrue(Pattern.matches(".+", "ABC"));
    assertTrue(Pattern.matches(".*", "Z"));
    assertTrue(Pattern.matches("A.Z", "AYZ"));
    assertFalse(Pattern.matches("A.F", "AGH"));
}
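To make the effect of Pattern.DOTALL more concrete, here is a small sketch that applies the A.B pattern to a two-line input. The class and helper names are assumptions made for this example. The embedded flag (?s) is the inline equivalent of Pattern.DOTALL:

```java
import java.util.regex.Pattern;

class DotAllSketch {

    // Hypothetical helper: compiles the regex with DOTALL so that
    // the dot also matches line terminators.
    static boolean matchesWithDotAll(String regex, String input) {
        return Pattern.compile(regex, Pattern.DOTALL)
          .matcher(input)
          .matches();
    }

    public static void main(String[] args) {
        String twoLines = "A\nB";

        // default mode: the dot refuses to match the line feed
        System.out.println(Pattern.matches("A.B", twoLines));     // false

        // DOTALL passed as a compile flag
        System.out.println(matchesWithDotAll("A.B", twoLines));   // true

        // DOTALL enabled inside the pattern itself
        System.out.println(Pattern.matches("(?s)A.B", twoLines)); // true
    }
}
```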
Match Multiple Characters
The asterisk “*” is a quantifier that matches the preceding element zero or more times. Combined with the dot, it provides the easiest way to match any number of characters that are not line terminators.
For instance, we can use it with the dot “.” or with a character class “[]”:
Pattern Example | Description |
B.*Y | matches a string that starts with B, followed by any number of chars, and ends with Y |
[0-9]* | zero or more digits only |
[a-z]* | zero or more lowercase letters |
[A-Z]* | zero or more uppercase letters |
[a-zA-Z]* | zero or more letters of either case |
Now, let’s create a test case to exemplify how to use the asterisk symbol to find any number of chars:
@Test
public void matchMultipleCharacterUsingRegex() {
    assertTrue(Pattern.matches("[0-9]*", "12345"));
    assertFalse(Pattern.matches("[0-9]*", "123ABC"));
    assertTrue(Pattern.matches("[a-z]*", "abcd"));
    assertTrue(Pattern.matches("[A-Z]*", "XYZ"));
    assertTrue(Pattern.matches("[a-zA-Z]*", "yzAB"));
}
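The B.*Y pattern from the table is not covered by the test above, so here is a short sketch of it. The class and method names are illustrative assumptions. It also shows that the asterisk accepts zero occurrences:

```java
import java.util.regex.Pattern;

class AsteriskSketch {

    // Hypothetical helper: true if the whole input starts with B,
    // ends with Y, and has any number of chars (even none) in between.
    static boolean startsWithBEndsWithY(String input) {
        return Pattern.matches("B.*Y", input);
    }

    // Hypothetical helper: true if the input consists of digits only.
    // An empty input also qualifies because "*" allows zero repetitions.
    static boolean isDigitsOrEmpty(String input) {
        return Pattern.matches("[0-9]*", input);
    }

    public static void main(String[] args) {
        System.out.println(startsWithBEndsWithY("BUSY"));  // true
        System.out.println(startsWithBEndsWithY("BY"));    // true
        System.out.println(startsWithBEndsWithY("BUSY!")); // false
        System.out.println(isDigitsOrEmpty(""));           // true
    }
}
```

If an empty input should be rejected, the plus quantifier, as in [0-9]+, is usually the better fit.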
Match Range of Characters
Furthermore, we can use the square brackets with a hyphen to match a range of characters.
The hyphen acts as a range delimiter as it separates the starting char and the ending char.
For instance, we can use the [0-9] pattern to match a single digit.
Pattern Example | Description |
[0-4][6-8] | matches a digit between 0 and 4, followed by a digit between 6 and 8 |
[a-z][1-6] | matches a lowercase letter followed by a digit between 1 and 6 |
[c-d][1-5][A-N] | matches a char between c and d, a digit between 1 and 5, and an uppercase letter between A and N |
Now, let’s demonstrate how to find a set of chars ranging between two given characters:
@Test
public void matchRangeOfCharacterUsingRegex() {
    assertTrue(Pattern.matches("[0-4][6-8]", "17"));
    assertFalse(Pattern.matches("[2-7][8-9]", "19"));
    assertTrue(Pattern.matches("[a-z]zhwani[1-6]", "azhwani5"));
    assertTrue(Pattern.matches("[a-z][A-Z]", "iN"));
}
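As an additional sketch, the three-range pattern [c-d][1-5][A-N] from the table can be exercised as follows. The class and method names are hypothetical:

```java
import java.util.regex.Pattern;

class RangeSketch {

    // Hypothetical helper: accepts exactly three chars, one from each range.
    static boolean isValidCode(String input) {
        return Pattern.matches("[c-d][1-5][A-N]", input);
    }

    public static void main(String[] args) {
        System.out.println(isValidCode("c3K")); // true
        System.out.println(isValidCode("d5N")); // true, range bounds are inclusive
        System.out.println(isValidCode("e3K")); // false, e is outside c-d
        System.out.println(isValidCode("c6K")); // false, 6 is outside 1-5
        System.out.println(isValidCode("c3Z")); // false, Z is outside A-N
    }
}
```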
Excluding Specific Characters
To exclude characters, we place them inside square brackets right after a leading caret, as in [^...]. The caret must be the first character inside the brackets. Placed outside them, it anchors the match to the start of the input instead.
For example, [^abc] matches any single character except a, b, and c.
Pattern Example | Description |
[^A] | matches any single character except A |
[^0-9] | matches a character that is not a digit |
[^A-Z] | matches a character that is not an uppercase letter |
Now, let’s see how to exclude characters using a regular expression in Java:
@Test
public void excludeCharactersUsingRegex() {
    assertTrue(Pattern.matches("[^a-z]", "A"));
    assertFalse(Pattern.matches("[^0-1]", "1"));
    assertTrue(Pattern.matches("[^A-Z]", "z"));
}
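The two meanings of the caret are easy to mix up, so here is a rough sketch that contrasts them. The class and helper names are assumptions introduced for this example:

```java
import java.util.regex.Pattern;

class CaretSketch {

    // Caret inside the brackets: negation.
    // Hypothetical helper that strips every char that is not a digit.
    static String keepDigits(String input) {
        return input.replaceAll("[^0-9]", "");
    }

    // Caret outside the brackets: start-of-input anchor.
    // Hypothetical helper that checks whether the input begins with "abc".
    static boolean startsWithAbc(String input) {
        return Pattern.compile("^abc").matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(keepDigits("Tel: 555-0199")); // prints 5550199
        System.out.println(startsWithAbc("abcdef"));     // true
        System.out.println(startsWithAbc("xabc"));       // false
    }
}
```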
Escaping Special Characters
Sometimes, we want to match a character that has a special meaning in regular expressions, such as the dot, the backslash, or the caret.
To achieve this, we need to prefix the character with a backslash. For instance, to match a literal dot, we use the pattern “\.”, which is written as "\\." in a Java string literal because the backslash itself must be escaped there.
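The following sketch shows escaping in practice. The class and method names are hypothetical. Besides escaping by hand, it uses Pattern.quote(), which turns a whole string into a literal pattern:

```java
import java.util.regex.Pattern;

class EscapeSketch {

    // Hypothetical helper: matches "a", a literal dot, then "b".
    // In Java source, the regex \. is written as "\\.".
    static boolean isADotB(String input) {
        return Pattern.matches("a\\.b", input);
    }

    // Hypothetical helper: treats every char of the given text as a
    // literal, so metacharacters such as + or . lose their meaning.
    static boolean isLiteralMatch(String literal, String input) {
        return Pattern.matches(Pattern.quote(literal), input);
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("a.b", "axb"));       // true, unescaped dot
        System.out.println(isADotB("axb"));                      // false
        System.out.println(isADotB("a.b"));                      // true
        System.out.println(Pattern.matches("\\^", "^"));         // true, escaped caret
        System.out.println(Pattern.matches("\\\\", "\\"));       // true, escaped backslash
        System.out.println(Pattern.matches("1+1=2", "1+1=2"));   // false, + is a quantifier
        System.out.println(isLiteralMatch("1+1=2", "1+1=2"));    // true
    }
}
```

Pattern.quote() tends to be the safer choice when the text to match comes from user input, since we do not have to know which characters are special.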
Conclusion
To sum it up, in this tutorial we explained how to match any character using regex in Java.
Along the way, we have seen how to use regular expressions to match multiple chars.
Lastly, we showcased how to exclude and escape specific characters.