Learn Everything about Java String Split Method

This is a simple tutorial for Java beginners who are struggling with the String split() method. I have seen many students wonder how to split a string in Java with the help of the split() method.
In this tutorial, I show some simple exercises that can be used to learn and understand String split() and a few related methods.

  1. Understanding the two flavors of the split method
  2. The java.lang.String class contains two flavors of the split() method, which can be used to split a string. Here are the Javadoc definitions of these two methods:
    public String[] split(String regex) - This method splits the string around matches of the given regular expression and returns an array of strings.
    public String[] split(String regex, int limit) - This method splits the string around matches of the given regular expression and returns an array of strings. The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero, the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is negative, the pattern will be applied as many times as possible and the array can have any length. If n is zero, the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
    So for example, if you have the string "This is a String", then a) calling split(" ") (space as the delimiter) on it produces an array of four strings: [0] This, [1] is, [2] a, [3] String; b) calling split(" ", 2) on it produces an array of two strings: [0] This, [1] is a String.
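The effect of the limit parameter is easiest to see side by side. The following is a minimal sketch; the class name SplitLimitDemo, the helper splitWithLimit, and the sample string "a,b,," are made up for illustration.

```java
import java.util.Arrays;

public class SplitLimitDemo {

    // Hypothetical helper: splits the input on a single space using the given limit.
    static String[] splitWithLimit(String input, int limit) {
        return input.split(" ", limit);
    }

    public static void main(String[] args) {
        String str = "This is a String";

        // No limit: the pattern is applied as many times as possible.
        System.out.println(Arrays.toString(str.split(" ")));          // [This, is, a, String]

        // Limit 2: the pattern is applied once; the rest stays in the last entry.
        System.out.println(Arrays.toString(splitWithLimit(str, 2)));  // [This, is a String]

        // Zero versus negative limit only differs for trailing empty strings.
        String csv = "a,b,,";
        System.out.println(csv.split(",", 0).length);   // 2 (trailing empty strings discarded)
        System.out.println(csv.split(",", -1).length);  // 4 (trailing empty strings kept)
    }
}
```

A negative limit is the one to reach for when empty trailing fields matter, for example when parsing fixed-column records.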
  3. Simplest thing first - Using the split() method with one character delimiter
  4. As you can see, the split method takes a regular expression, so we can use one or more characters as a delimiter. To start with a simple example, I have chosen the comma "," as the delimiter for my sentence. For example, if you have the string "This,is,a,String", then a split using a comma as the delimiter results in an array of four strings. See the example code below.
    public static void  singleCharacterDelimiterTest() {
    String str = "This,is,a,String";
    String delimiter = ",";
    String[] splitStrings = str.split(delimiter);
    for (int i = 0; i < splitStrings.length; i++) {
    System.out.println(splitStrings[i]);
    }
    }
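If the input has spaces after the commas, such as "This, is, a, String", a plain "," delimiter leaves a leading space on every piece after the first. One way to handle that is to let the regular expression absorb the whitespace. This is a small sketch; the class name CommaWithSpacesDemo and the helper splitOnComma are hypothetical.

```java
public class CommaWithSpacesDemo {

    // Hypothetical helper: splits on a comma together with any whitespace around it.
    static String[] splitOnComma(String input) {
        // \\s* matches zero or more whitespace characters on either side of the comma
        return input.split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        String[] parts = splitOnComma("This, is, a, String");
        for (int i = 0; i < parts.length; i++) {
            // Brackets make any stray whitespace visible in the output.
            System.out.println("[" + i + "] <" + parts[i] + ">");
        }
    }
}
```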
    
  5. How about using another string as a delimiter?
  6. Yes, you can always use a string as a delimiter. Remember that the delimiter is a regular expression, so any valid regular expression can be passed in this parameter. For example, if you have the string "ThisWORDisWORDaWORDString", then a split using the word WORD as the delimiter results in an array of four strings. See the example code below.
    public static void  singleWordDelimiterTest() {
    String str = "ThisWORDisWORDaWORDString";
    String delimiter = "WORD";
    String[] splitStrings = str.split(delimiter);
    for (int i = 0; i < splitStrings.length; i++) {
    System.out.println("[" + i + "] " + splitStrings[i]);
    }
    } 
    
    
    The output of this method will be
    [0] This
    [1] is
    [2] a
    [3] String
    
  7. How about using multiple characters as a delimiter?
  8. There can be scenarios where you want to split the string on any of several characters. For example, if you have the string "This,is:a;String", then you need to split using a comma, colon, or semicolon as the delimiter. You need not be an expert in regular expressions to do this; the pipe (|) alternation operator is sufficient. The regular expression for these three delimiters is ",|:|;". See the example code below.
    public static void  multiCharacterDelimiterTest() {
    String str = "This,is:a;String";
    String delimiter = ",|:|;";
    String[] splitStrings = str.split(delimiter);
    for (int i = 0; i < splitStrings.length; i++) {
    System.out.println("[" + i + "] "+ splitStrings[i]);
    }
    }
    
    The output of this method will be
    [0] This
    [1] is
    [2] a
    [3] String
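For single-character delimiters, a regular expression character class is an equivalent and slightly more compact alternative to the pipe operator. The sketch below assumes the same three delimiters; the class name CharacterClassDemo and the helper splitOnPunctuation are made up for illustration.

```java
public class CharacterClassDemo {

    // Hypothetical helper: splits on a comma, colon, or semicolon.
    static String[] splitOnPunctuation(String input) {
        // [,:;] matches any single one of the three characters,
        // which is the same set of delimiters as ",|:|;"
        return input.split("[,:;]");
    }

    public static void main(String[] args) {
        String[] parts = splitOnPunctuation("This,is:a;String");
        for (int i = 0; i < parts.length; i++) {
            System.out.println("[" + i + "] " + parts[i]);
        }
    }
}
```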
    
  9. How about using multiple words as a delimiter?
  10. Let's assume a hypothetical scenario where you want to split the string on any of several words. For example, if you have the string "ThisWORD1isWORD2aWORD3String", then you need to split using WORD1, WORD2, or WORD3 as the delimiter. Again, the pipe (|) operator is sufficient. The regular expression for these three delimiters is "WORD1|WORD2|WORD3". See the example code below.
    public static void  multiWordDelimiterTest() {
    String str = "ThisWORD1isWORD2aWORD3String";
    String delimiter = "WORD1|WORD2|WORD3";
    String[] splitStrings = str.split(delimiter);
    for (int i = 0; i < splitStrings.length; i++) {
    System.out.println("[" + i + "] "+ splitStrings[i]);
    }
    }
    
    The output of this method will be
    [0] This
    [1] is
    [2] a
    [3] String
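When the delimiter words follow a pattern, the alternation can often be collapsed into a single expression. The sketch below assumes every delimiter is the text WORD followed by exactly one digit, which is true for the sample string but is only an assumption in general; the class name WordPatternDemo and the helper splitOnNumberedWord are hypothetical.

```java
public class WordPatternDemo {

    // Hypothetical helper: splits on "WORD" followed by exactly one digit.
    static String[] splitOnNumberedWord(String input) {
        // \\d matches a single digit, so this covers WORD0 through WORD9
        return input.split("WORD\\d");
    }

    public static void main(String[] args) {
        String[] parts = splitOnNumberedWord("ThisWORD1isWORD2aWORD3String");
        for (int i = 0; i < parts.length; i++) {
            System.out.println("[" + i + "] " + parts[i]);
        }
    }
}
```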
    
  11. Keeping a watch on a few special characters
  12. As discussed in the previous scenarios, the delimiter is a regular expression, and therefore a few characters have special meanings. If any such character needs to be used as a delimiter, it must be escaped with \\ in the delimiter string (one backslash for the Java string literal, one for the regex engine). Some examples are the pipe (|), dollar sign ($), dot (.), and caret (^). See the example code below.
    public static void  specialCharaterDelimiterTest() {
    String str = "This|is^a$String";
    String delimiter = "\\||\\^|\\$";
    String[] splitStrings = str.split(delimiter);
    for (int i = 0; i < splitStrings.length; i++) {
    System.out.println("[" + i + "] "+ splitStrings[i]);
    }
    }
    
    The output of this method will be
    [0] This
    [1] is
    [2] a
    [3] String
    
    You can try many other regular expressions and explore the power of this method. Visit http://www.regular-expressions.info/ for a more in-depth understanding of regular expressions.
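If you would rather not escape special characters by hand, java.util.regex.Pattern.quote() returns a regular expression that matches the given text literally. The following is a minimal sketch of that approach; the class name QuoteDelimiterDemo and the helper splitOnLiterals are made up for illustration.

```java
import java.util.regex.Pattern;

public class QuoteDelimiterDemo {

    // Hypothetical helper: splits on any of the given delimiters, each taken literally.
    static String[] splitOnLiterals(String input, String... delimiters) {
        StringBuilder regex = new StringBuilder();
        for (String delimiter : delimiters) {
            if (regex.length() > 0) {
                regex.append('|'); // alternation between the quoted delimiters
            }
            // Pattern.quote wraps the text so regex metacharacters lose their meaning
            regex.append(Pattern.quote(delimiter));
        }
        return input.split(regex.toString());
    }

    public static void main(String[] args) {
        String[] parts = splitOnLiterals("This|is^a$String", "|", "^", "$");
        for (int i = 0; i < parts.length; i++) {
            System.out.println("[" + i + "] " + parts[i]);
        }
    }
}
```

This is handy when the delimiter comes from user input or configuration and may contain characters you did not anticipate.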
  13. String Split using Java 1.1 Style - Using StringTokenizer class
StringTokenizer is a legacy class that has been part of the JDK since version 1.1. The StringTokenizer class allows an application to break a string into tokens. The set of delimiters (the characters that separate tokens) may be specified either at creation time or on a per-token basis. StringTokenizer's nextToken() method can be used to produce the same output. See the example code below.
// Requires: import java.util.StringTokenizer;
public static void splitByTokenizer() {
    // Each character in ",:;" is treated as a separate delimiter.
    StringTokenizer st = new StringTokenizer("This,is:a;String", ",:;");
    while (st.hasMoreTokens()) {
        System.out.println(st.nextToken());
    }
}
The output of this method will be
This
is
a
String
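The two approaches are not identical in every case. StringTokenizer treats a run of consecutive delimiters as one separator and never returns empty tokens, while split() returns an empty string between two adjacent delimiters. The sketch below illustrates this with a made-up input "a,,b"; the class name EmptyTokenDemo and both helper methods are hypothetical.

```java
import java.util.StringTokenizer;

public class EmptyTokenDemo {

    // Hypothetical helper: number of tokens StringTokenizer finds.
    static int countWithTokenizer(String input, String delimiters) {
        return new StringTokenizer(input, delimiters).countTokens();
    }

    // Hypothetical helper: number of pieces split() returns.
    static int countWithSplit(String input, String regex) {
        return input.split(regex).length;
    }

    public static void main(String[] args) {
        String input = "a,,b";
        System.out.println(countWithTokenizer(input, ","));  // 2 (a, b)
        System.out.println(countWithSplit(input, ","));      // 3 (a, empty string, b)
    }
}
```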
The split() method was introduced in JDK 1.4, and the use of StringTokenizer has been discouraged since then. Here is what Sun says in the JDK 1.6 Javadocs: "StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead." Let me know your comments and feedback on this tutorial.

Comments

  1. Hi,

    You've a problem in your 3rd example, the output must be :

    [0] This
    [1] is
    [2] a
    [3] String

    and not

    [0] D
    [1] sM
    [2] sTh
    [3] String

    ;)

  2. @Baptiste Thanks for pointing that out. I have corrected it now. Let me know if you have any other comments.

  3. Small remark: you're escaping your special characters (|, ^ and $) twice, effectively escaping the escape character.

    BTW, this form doesn't allow cut/copy/paste, or using home/end/ctrl/shift/arrows... What's up with that? (Using latest Firefox.)

    Excellent post for newbies.

  4. @Anonymous - 1. Both escape characters are required here: one is for Java string escaping and the other is for the regex engine. If we use only one escape character, the Java program will not compile. Let me know if you need more details.

    2. I have seen that problem too sometimes. I just searched Google for this issue and found that there is malware which affects Firefox. Please check this link for details and do let me know if you still see the problem.

    Firefox Copy & Paste Bug

  5. This comment is from IE6.0 :
    checking the copy paste bug on IE6.0. It seems to be working fine. No issues with copy/paste and home/end keys work perfectly alright. Need to check some other browsers too.

  6. Looks like I am able to reproduce this issue from my Firefox 3.5.3. The issue seems to be reproducible only if I logout from my google account and try to enter comment. IE6.0 was looking fine without login to google account.

    Here are the issues
    1. Home key doesn't work.
    2. Arrow keys don't work.
    3. right click menu does not show any copy paste options.
    4. Cannot do select all using "Ctrl+A"

    Readers:
    Till I figure out the solution for this issue please try to use already logged in google account in case you want to do copy paste and other keyboard activities in comment textarea.

  7. This is a comment from Google Chrome 3.0.195.21

    The comment form seems to be working fine. No issue even when not logged in to google account.

  8. What happens if the delimiter is not found in the String to split?

    For example:

    String str = "Hello World";
    String[] splitStrings = str.split("*");

    What is going to be the value of splitStrings?

    Thanks in advance.

  9. @Anonymous - If the delimiter is not found then original String should be returned as it is.

    So in case of your example the splitStrings array size will be 1 and splitStrings[0] will have value "Hello World".

  10. Thanks for your answer, this is very useful.

    Actually, when I tried my example above (posted on 10/31/2009), I found out that * and + are reserved characters for regex, and they need to be enclosed in [] for the method to work - otherwise the following exception is thrown:

    java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0

  11. You are right. Any regular expression special character needs to be escaped inside a Java String. For example, for the String

    "This|is+a$String"

    if we use delimiter expression as

    "\\||+|\\$"

    then it is going to throw the error below.

    Exception in thread "main" java.util.regex.PatternSyntaxException: Dangling meta character '+' near index 3
    \||+|\$
       ^
    at java.util.regex.Pattern.error(Unknown Source)
    at java.util.regex.Pattern.sequence(Unknown Source)
    at java.util.regex.Pattern.expr(Unknown Source)
    at java.util.regex.Pattern.compile(Unknown Source)
    at java.util.regex.Pattern.<init>(Unknown Source)
    at java.util.regex.Pattern.compile(Unknown Source)
    at java.lang.String.split(Unknown Source)
    at java.lang.String.split(Unknown Source)
    at RegExTest.specialCharaterDelimiterTest(RegExTest.java:59)
    at RegExTest.main(RegExTest.java:53)


    Instead, if we escape the plus (+) character using a double backslash like this:

    "\\||\\+|\\$"

    then proper result will be displayed.
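The two cases discussed in this thread (a valid delimiter that never occurs in the string, and a delimiter that is not a valid regular expression) can be told apart with a small sketch like the one below. The class name NoMatchDemo and the helper isValidRegex are hypothetical and exist only for illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class NoMatchDemo {

    // Hypothetical helper: reports whether the text compiles as a regular expression.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String str = "Hello World";

        // Valid delimiter that does not occur: the whole string comes back as one element.
        String[] parts = str.split("\\*");
        System.out.println(parts.length + " -> " + parts[0]);  // 1 -> Hello World

        // A bare "*" is a dangling metacharacter, so it is rejected before any splitting.
        System.out.println(isValidRegex("*"));    // false
        System.out.println(isValidRegex("\\*"));  // true
    }
}
```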

