A Java regular expression (regex) is a sequence of characters that defines a pattern for matching, locating, and manipulating text. In other words, we use regex in Java to search, validate, and transform strings.
The java.util.regex package contains three classes and one interface:
- Pattern class – a compiled representation of a regular expression. It has no public constructors; we obtain a Pattern object through its static compile() method, which takes the regular expression as its first parameter.
- Matcher class – performs match operations against an input string. Like Pattern, it has no public constructors; we obtain a Matcher by calling the matcher() method on a Pattern object.
- PatternSyntaxException class – an unchecked exception that indicates a syntax error in a regular expression pattern.
- MatchResult interface – represents the result of a match operation.
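To see how these four types fit together, here is a minimal sketch. The class name RegexTypesDemo and the helper methods isValidRegex and firstMatch are hypothetical names chosen for illustration; only the java.util.regex calls come from the standard library.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; the names below are illustrative only.
class RegexTypesDemo {

    // Returns true if the regex compiles, false if its syntax is invalid.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // Unchecked exception thrown by Pattern.compile for malformed patterns.
            return false;
        }
    }

    // Returns a snapshot of the first match, or null if there is none.
    static MatchResult firstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.toMatchResult() : null;
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("[a-z]+")); // true
        System.out.println(isValidRegex("[a-z"));   // false: unclosed character class

        MatchResult result = firstMatch("\\d+", "room 42, floor 3");
        System.out.println(result.group() + " at " + result.start() + "-" + result.end()); // 42 at 5-7
    }
}
```

Because PatternSyntaxException is unchecked, the try/catch is optional; it is mainly useful when the pattern comes from user input.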
Regex Pattern class methods
- compile(String regex) – Compiles the given regular expression into a pattern.
- compile(String regex, int flags) – Compiles the given regular expression into a pattern with the given flags.
- flags() – Returns this pattern’s match flags.
- matcher(CharSequence input) – Creates a matcher that will match the given input against this pattern.
- matches(String regex, CharSequence input) – Compiles the given regular expression and attempts to match the given input against it.
- pattern() – Returns the regular expression from which this pattern was compiled.
- quote(String s) – Returns a literal pattern String for the specified String.
- split(CharSequence input) – Splits the given input sequence around matches of this pattern.
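- split(CharSequence input, int limit) – Splits the given input sequence around matches of this pattern; the limit controls how many times the pattern is applied and therefore the maximum length of the resulting array.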
- toString() – Returns the string representation of this pattern.
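A few of these methods are easier to follow in code. The sketch below exercises compile() with a flag, quote(), and split() with a limit. The class PatternMethodsDemo and its helper methods are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names below are illustrative only.
class PatternMethodsDemo {

    // Case-insensitive full match using the two-argument compile().
    static boolean matchesIgnoringCase(String regex, String input) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        return pattern.matcher(input).matches();
    }

    // Splits on a literal delimiter; quote() stops characters such as '.'
    // from being interpreted as regex metacharacters.
    static String[] splitOnLiteral(String delimiter, String input, int limit) {
        return Pattern.compile(Pattern.quote(delimiter)).split(input, limit);
    }

    public static void main(String[] args) {
        System.out.println(matchesIgnoringCase("london", "LONDON"));           // true
        System.out.println(String.join("|", splitOnLiteral(".", "a.b.c", 0))); // a|b|c
        System.out.println(String.join("|", splitOnLiteral(".", "a.b.c", 2))); // a|b.c
    }
}
```

Without quote(), the delimiter "." would match every character, so the split would not produce the pieces shown here.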
Most used methods of Matcher class
- boolean matches() – attempts to match the entire input against the pattern.
- boolean find() – attempts to find the next subsequence of the input sequence that matches the pattern.
- boolean find(int start) – resets the matcher and then attempts to find the next subsequence that matches the pattern, starting at the given index.
- String group() – returns the matched subsequence.
- int start() – returns the start index of the previous match.
- int end() – returns the offset after the last character matched.
- int groupCount() – returns the number of capturing groups in this matcher's pattern (group 0, the whole match, is not counted).
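As a rough illustration of how these methods combine, the following sketch calls find() in a loop and reads group(), start(), and end() after each successful match. MatcherMethodsDemo, describeMatches, and capturingGroups are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names below are illustrative only.
class MatcherMethodsDemo {

    // Collects every match as "text@start-end" by calling find() until it returns false.
    static List<String> describeMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group() + "@" + matcher.start() + "-" + matcher.end());
        }
        return found;
    }

    // groupCount() depends only on the pattern, not on whether a match has happened.
    static int capturingGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(describeMatches("\\d+", "a1bb22ccc333")); // [1@1-2, 22@4-6, 333@9-12]
        System.out.println(capturingGroups("(\\w+)@(\\w+)"));        // 2
    }
}
```

Note that end() is exclusive: the match "22" occupies indices 4 and 5, and end() returns 6.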
Using Regex in Java – examples
Example 1:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        // . (dot) matches any single character
        Pattern p = Pattern.compile(".d");
        Matcher matcher = p.matcher("zd");
        System.out.println(matcher.matches());
    }
}
Output: true
Here the regex matches any two-character string whose second character is d. Since we passed zd, the output is true.
Example 2:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        String stringToBeMatched = "London is the number 1 city in UK";
        String pattern = "(.*)(\\d+)(.*)";
        Matcher matcher = Pattern.compile(pattern).matcher(stringToBeMatched);
        if (matcher.find()) {
            System.out.println("group 0: " + matcher.group(0));
            System.out.println("group 1: " + matcher.group(1));
            System.out.println("group 2: " + matcher.group(2));
        } else {
            System.out.println("No match found!");
        }
    }
}
Output:
group 0: London is the number 1 city in UK
group 1: London is the number
group 2: 1
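In Example 2 above, group 2 captures the whole number only because the number has a single digit. The leading (.*) is greedy: it consumes as much as possible and leaves (\\d+) just one digit. The sketch below, which uses a made-up input string and a hypothetical helper secondGroup, compares the greedy pattern with a reluctant (.*?) variant on a multi-digit number.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names and the input string are illustrative only.
class GreedyGroupDemo {

    // Returns the text captured by group 2, or null if the pattern does not match.
    static String secondGroup(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group(2) : null;
    }

    public static void main(String[] args) {
        String input = "Flight 2024 departs";
        // Greedy (.*) takes as much as it can, leaving a single digit for (\d+).
        System.out.println(secondGroup("(.*)(\\d+)(.*)", input));  // 4
        // Reluctant (.*?) takes as little as it can, so (\d+) captures the whole number.
        System.out.println(secondGroup("(.*?)(\\d+)(.*)", input)); // 2024
    }
}
```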
Regular Character classes
Here are some examples of Regular Character classes that we can use:
. (DOT) | Any character (may or may not match line terminators) |
\d | A digit: [0-9] |
\D | A non-digit: [^0-9] |
\s | A whitespace character: [ \t\n\x0B\f\r] |
\S | A non-whitespace character: [^\s] |
\w | A word character: [a-zA-Z_0-9] |
\W | A non-word character: [^\w] |
[abc] | a, b, or c |
[^abc] | Any character except a, b, or c |
[a-zA-Z] | a through z or A through Z inclusive |
[a-d[m-p]] | a through d, or m through p: [a-dm-p] |
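The union and negation rows of the table can be checked with a short sketch. CharacterClassDemo and its two helper methods are hypothetical names used only here; the patterns themselves are taken from the table, with a + quantifier added to the negated class so it can cover a whole word.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names below are illustrative only.
class CharacterClassDemo {

    // True when the single-character input belongs to the union class [a-d[m-p]].
    static boolean inUnion(String input) {
        return Pattern.matches("[a-d[m-p]]", input);
    }

    // True when the input is non-empty and contains no a, b, or c.
    static boolean avoidsAbc(String input) {
        return Pattern.matches("[^abc]+", input);
    }

    public static void main(String[] args) {
        System.out.println(inUnion("c"));       // true: in a-d
        System.out.println(inUnion("n"));       // true: in m-p
        System.out.println(inUnion("h"));       // false: in neither range
        System.out.println(avoidsAbc("hello")); // true
        System.out.println(avoidsAbc("cab"));   // false
    }
}
```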
Using Regular Character classes – examples
Example 1:
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        // [dzb] matches exactly one character, which must be d, z, or b
        System.out.println(Pattern.matches("[dzb]", "olkm"));    // false (not a single d, z, or b)
        System.out.println(Pattern.matches("[dzb]", "b"));       // true (b is one of d, z, b)
        System.out.println(Pattern.matches("[dzb]", "dazbaze")); // false (the input has more than one character)
    }
}
Example 2:
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        System.out.println(Pattern.matches("\\D", "a")); // true - provided a non-digit
        System.out.println(Pattern.matches(".", "ab"));  // false - two characters provided
        System.out.println(Pattern.matches("\\s", "a")); // false - provided a non-whitespace character
    }
}
Example 3:
Create a regular expression that accepts only alphanumeric characters and requires the input to be exactly 5 characters long.
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        System.out.println(Pattern.matches("[a-zA-Z0-9]{5}", "abcde"));
        System.out.println(Pattern.matches("[a-zA-Z0-9]{5}", "12bce"));
        System.out.println(Pattern.matches("[a-zA-Z0-9]{5}", "practise"));
        System.out.println(Pattern.matches("[a-zA-Z0-9]{5}", "count"));
    }
}
Output:
true
true
false
true
Example 4:
Create a regular expression that accepts a 5-digit input that starts with 1, 7, or 9:
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        // One of 1, 7, or 9, followed by exactly four digits
        Pattern pattern = Pattern.compile("[179]{1}[0-9]{4}");
        System.out.println(pattern.matcher("72135").matches());
        System.out.println(pattern.matcher("123a5").matches());
        System.out.println(pattern.matcher("abc47").matches());
        System.out.println(pattern.matcher("93214").matches());
        System.out.println(pattern.matcher("abcndkoupz").matches());
    }
}
Output:
true
false
false
true
false
This was an introduction to regex in Java. For a full summary of regular-expression constructs, check out the official documentation of the Pattern class.
Happy coding!