Count occurrences of substring in input string

1. Introduction

In this article we will count occurrences of substring in input string. Below is an exmaple.

Example:
String str = "banana";
String substring = "an";

Output = 2

In word "Banana", "an" is repeated twice.
Example:
String str = "banana";
String substring = "xy";

Output = 0

In word "Banana", "xy" is not present at all.

2. So, how to write a code for this problem?

Let us define the scenarios as below:

ScenarioString inputString substringOutput
Scenario 1nullnot null-1
Scenario 2not nullnull-1
Scenario 3Nullnull-1
Scenario 4not nullnot null>=0

Scenario 1 Scenario 2 and Scenario 3 can be translated as what if either of input is null?

if (str == null || subString == null) {
    return -1;
}

What if the length of substring is greater than input string?

if (subString.length() > str.length()) {
    return 0;
}

If the substring length is greater than input string length then return 0 as there won’t be any matching substring due to its length.

Now last phase. Count number of occurrences if they exist in input string.

int count = 0;
int pos = 0;
int index = 0;
while ((index = str.indexOf(subString, pos)) != -1) {
    count++;
    pos = index + subString.length();
}
  • Let us set count and current position to 0
  • Take index = str.indexOf(subString, pos)
  • If index == -1  then there is no such string, return
  • Else increment count as count++ and set new position as pos = index + subString.length();
  • Check if there is any other such substring in input string.

Entire code is as below.

public static int countOccurences(String str, String subString) {
    if (str == null || subString == null  || subString.isEmpty()) {
        return -1;
    } else if (subString.length() > str.length()) {
        return 0;
    }

    int count = 0;
    int pos = 0;
    int index = 0;
    while ((index = str.indexOf(subString, pos)) != -1) {
        count++;
        pos = index + subString.length();
    }

    return count;
}

3. Analysis of countOccurences method:

  1. Time complexity
    1. O(n) is time complexity where n is length of the string which we are scanning to search for substring.
  2. Space complexity
    1. O(1) is space complexity as we are not using any auxiliary storage.

4. Conclusion

In this article, we learned how to find total number of occurrences of substring in a string.

Leave a Reply

Your email address will not be published. Required fields are marked *