
How To Use Regular Expressions In Java


Have you ever needed to search through a string and check whether it meets some condition? For instance, verifying that a string is a valid email address, or that it contains a certain pattern. If you have, and you’re not sure how to go about it, keep reading!

Regular expressions can certainly do the job, although using regular expressions (or regex, if you will) can be quite tricky. This was certainly the case for me. I had a hard time understanding them because nobody really explained regular expressions well enough. I’m hoping to change that now.

How To Use Regular Expressions

When you want to use regular expressions, the first thing you do is define a ‘pattern’ that a string will be checked against. This can be anything, really. An example could be verifying whether a string looks like a valid email address. We know that an email address consists of some text, then the @ symbol, some more letters, a dot, and finally some more letters. I won’t jump right into explaining how to write a pattern like that yet, because I need to explain the syntax first.

Let’s start with a simple example. Let’s check if a string contains a certain word. We’ll pick the word Coding.

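import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexExample {
    public static void main(String[] args) {
        // The pattern to look for; a plain word is already a valid regex
        String regex = "Coding";
        // The text we want to search
        String str = "Coding is love, Coding is life";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(str);
        // find() returns true if the pattern occurs anywhere in str
        if (m.find()) {
            System.out.println(regex + " was found in string: " + str);
        } else {
            System.out.println(regex + " was not found in string: " + str);
        }
    }
}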

The first thing we did was declare two variables: regex and str. The regex variable holds the pattern we defined. We haven’t actually used any advanced syntax yet, but we’ll get to that shortly.

The str variable is the string we want to compare to the pattern we defined. Since we don’t need any more variables, we can compile the pattern, which is what the next line does. We need the Pattern class to do this; it is absolutely necessary if we want to use regex. So we pass our pattern, the regex string, to the Pattern class’s static compile method, which gives us a pattern object.

Then we need an object of the Matcher class, which we get by calling our pattern object’s matcher method. Into the p.matcher method, we pass the string we want to compare to the pattern.

The rest is pretty self-explanatory. We use the m.find method to check whether the string contains a match for the pattern. In simpler words, we used the method to check if str contained the word “Coding”.
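To make the behaviour of find a little more concrete, here is a small sketch that compares it with the Matcher class’s matches method, which only succeeds when the pattern covers the whole string. The class name FindVersusMatches and the two helper methods are made up for this illustration; only Pattern, Matcher, find and matches come from the standard library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindVersusMatches {
    // True if the pattern occurs anywhere inside the input (hypothetical helper).
    static boolean foundAnywhere(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find();
    }

    // True only if the pattern covers the entire input (hypothetical helper).
    static boolean matchesWhole(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches();
    }

    public static void main(String[] args) {
        String str = "Coding is love, Coding is life";
        System.out.println(foundAnywhere("Coding", str));     // true: the word is in there
        System.out.println(matchesWhole("Coding", str));      // false: str is more than just "Coding"
        System.out.println(matchesWhole("Coding", "Coding")); // true
        System.out.println(foundAnywhere("coding", str));     // false: matching is case-sensitive by default
    }
}
```

So find is the one to reach for when you only care whether the pattern shows up somewhere in the string.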

This is as basic as it gets, but let’s move on to something a little bit more complicated; let’s use the syntax.

Using The Syntax

Let’s say we want to check if the string actually matches a pattern and not just a fixed word. I’ll show you an example where we check if a string starts with one or more digits followed by one or more letters. We only need to modify our regex and str variables. Let’s start with the regex variable. It should look like this.

String regex = "^[\\d]+[a-zA-Z]+";

That looks really complicated, but don’t let it scare you. I’ll explain what everything means.

Let’s start with ^, which basically means the start of the string (or the start of each line, in multiline mode). Since we’re checking if the string starts with a digit, we need to anchor the pattern to the start. However, the ^ has another meaning inside square brackets, but I’ll get to that later.
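Here is a quick sketch of what the ^ anchor changes, reusing the word from the first example. The class StartAnchorDemo and its found helper are names invented for this illustration.

```java
import java.util.regex.Pattern;

class StartAnchorDemo {
    // True if the pattern occurs somewhere in the input (hypothetical helper).
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // With ^ the word must sit at the very start of the string
        System.out.println(found("^Coding", "Coding is love")); // true
        System.out.println(found("^Coding", "I love Coding"));  // false
        // Without ^ the word may appear anywhere
        System.out.println(found("Coding", "I love Coding"));   // true
    }
}
```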

Let’s move on to the [\d], which means any digit. In Java, we need to type two backslashes (\\d) because the backslash is itself an escape character in Java strings; \n, for instance, means a newline. You could also write [0-9], which means the same thing. After the closing square bracket, you encounter the + symbol. This means that there needs to be one or more occurrences of what comes right before it. Then we get some more square brackets, and within them is a-zA-Z. Try and guess what that means. If you can’t, it means that any letter, both lowercase and uppercase, is accepted. And at the end, we have another + symbol, so we need one or more letters.
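If the double backslash feels confusing, this small sketch may help. It prints what the regex engine actually receives and checks that [\d] and [0-9] accept the same strings here. DigitClassDemo and matchesWhole are names made up for the example; Pattern.matches is the standard shortcut for compiling a pattern and matching it against the whole input.

```java
import java.util.regex.Pattern;

class DigitClassDemo {
    // True only if the whole input matches the pattern (hypothetical helper).
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String escaped = "[\\d]+"; // two backslashes in the source code...
        String range = "[0-9]+";
        System.out.println(escaped); // ...but the string itself is [\d]+

        System.out.println(matchesWhole(escaped, "2024")); // true
        System.out.println(matchesWhole(range, "2024"));   // true
        System.out.println(matchesWhole(escaped, "20a4")); // false: 'a' is not a digit
        System.out.println(matchesWhole(range, "20a4"));   // false
    }
}
```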

So if our string begins with one or more digits followed by one or more letters, our check will return true.
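Putting it together, a minimal sketch of that check might look like the following. The class DigitsThenLetters, the method name and the sample strings are my own picks for illustration; the pattern is the one described above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitsThenLetters {
    // Compile once and reuse; the pattern is the one from the article
    private static final Pattern PATTERN = Pattern.compile("^[\\d]+[a-zA-Z]+");

    // True if the input starts with one or more digits followed by one or more letters.
    static boolean startsWithDigitsThenLetters(String input) {
        Matcher m = PATTERN.matcher(input);
        return m.find();
    }

    public static void main(String[] args) {
        System.out.println(startsWithDigitsThenLetters("123abc")); // true
        System.out.println(startsWithDigitsThenLetters("abc123")); // false: starts with a letter
        System.out.println(startsWithDigitsThenLetters("42"));     // false: no letters after the digits
        System.out.println(startsWithDigitsThenLetters("7up!"));   // true: find only needs the start to fit
    }
}
```

Try changing the sample strings and the pattern to see how the result changes.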

Regular expressions are really complicated but they are really powerful too. The only way you’re really going to grasp the concept is if you experiment yourself. Create your own checks.

There are more symbols though, and I’ll get to explaining those right now.

This article is contributed by Sander Strand. Sander has a website CodingForNewbies created solely for the purpose of teaching others about programming.
