Code Puzzle: Anagram Detector

Continuing with the series on code puzzles, here is another question: Given two input Strings, how can you detect if they are anagrams of each other (that they have exactly the same set of letters although not necessarily in the same order)? I have seen this used as an interview question during technical interviews. There are a few ways to implement this, and I think it’s a little bit richer of a problem then the string reversal problem. Let’s take a look.

An interface might look something like this:

public interface AnagramDetector {
   boolean isAnagram(String input1, String input2);
}

To approach this we might ask how we would do it by hand. First of all we can instantly recognize something that is NOT an anagram: two strings of different lengths by definition must not have exactly the same amount of each letters, so we can eliminate that condition immediately.

Continuing onward, we might count the number of occurrences of the letter “A” in each string and compare the count, then count the number of “B’s” and compare the count, and so on. As with the previous code puzzles we looked at, this can be implemented with primitives or objects, with custom algorithms or standard libraries. In this case I opted to use Objects so that I could leverage the standard library and use Collections.frequency(). Certainly if we wanted we could scan each String’s char[] by int index and write our own frequency method, but I thought this technique was clearer.

Finally, we only scan for characters that we know occur in the String. If we get the set of unique characters from the first string, and the second string was longer than the first, and we didn’t eliminate the input string pair earlier based on different length, this would cause a problem because we might miss characters that only appear in the second string (and get a false positive on the detection). But because we eliminated the condition of the two strings having different lengths, obtaining the unique characters in this way is OK.

Such a solution might look like this (null checking omitted for clarity).

import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import org.apache.commons.lang.ArrayUtils;

public class AnagramDetectorScanner implements AnagramDetector {
   @Override
   public boolean isAnagram(String input1, String input2) {
      
      // return immediately if by definition it can't be an anagram
      if(input1.length() != input2.length()) {
         return false;
      }

      // get unique characters so you don't scan for characters that aren't there
      Set<Character> uniqueCharacters = new TreeSet<>(toCharacters(input1.toLowerCase()));

      // use a Collection of Characters so we can leverage Collections.frequency() below
      List<Character> input1Set = toCharacters(input1.toLowerCase());
      List<Character> input2Set = toCharacters(input2.toLowerCase());
      
      // scan each string for each character
      // if the counts are different, they are not anagrams
      for(Character uniqueCharacter : uniqueCharacters) {
         // char is autoboxed, no need to convert char to Character myself
         char uniqueChar = uniqueCharacter.charValue();
         int count1 = Collections.frequency(input1Set, uniqueChar);
         int count2 = Collections.frequency(input2Set, uniqueChar);
         if(count1 != count2) {
            return false;
         }
      }

      return true;
   }
   
   private List<Character> toCharacters(String input) {
      return Arrays.asList(ArrayUtils.toObject(input.toLowerCase().toCharArray()));
   }
}

Finally of course this is the kind of code that lends itself to TDD, so a test should be part of what we write.

There are some other approaches to the anagram problem as well that we can explore in more detail next time.

Advertisements

Leave a comment

Filed under Software Engineering

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s