Monthly Archives: July 2013

Vagrant For Java: Should we compile on the host or the VM?

Here’s the question: When using Vagrant for a Java project (or any compiled-language project, for that matter), should we compile in the VM or on the host? And should the IDE and the rest of the development tools run inside the VM as well, or on the host?

Much of the time Vagrant is advertised for development with PHP, Ruby, Node, etc. For these languages, code is edited on the host and run on the VM. There is no compilation step, so the “edit here / run there” paradigm works very well.

However, it is not well defined how a Java IDE and the compile/deploy process should work with a Vagrant VM. One could make an argument for compiling on the VM, for compiling on the host machine, or even for running the IDE directly inside the VM. Other answers on Stack Overflow have implied that Vagrant is less useful for compiled languages because of the extra compile step, but let’s give it some thought and examine the pros and cons.

Some considerations:

Why have IDE or compile on the VM

  • If compiling on host, Java is one more piece of software to install, and it introduces a complication if you want to use different versions of Java for different projects
  • If compiling on host, the Java version on the host must be manually kept in sync with the one on the project’s VM. The corresponding Java version might be unavailable for the host (say, on a Mac, where Java releases have historically lagged behind those for other platforms)
  • Tighter integration between the compilation environment and the IDE allows you to use shortcuts to compile and run the application
  • We can connect the debugger for Java (non-web) applications without remote debugging (one click run/debug)
  • If everything is on the VM including tools, this provides an instant development environment so new developers can get up and running very quickly

Why have IDE or compile on the host

  • Faster compile times
  • Ensures compile and run are done with the same JDK
  • Better UI performance (X forwarding and VNC are too slow for daily development)

What are your thoughts: Is it better to run the IDE from inside the VM or the host? Should Java be compiled from inside the VM or the host?

Filed under Software Engineering

Code Puzzle: Anagram Detector III

We have been looking at different approaches to detecting anagrams given two strings. In our first detector we repeatedly scanned the strings, counting and comparing the occurrences of each character. In our second detector we sorted each string and compared the results to each other. In this third attempt we will look at a more efficient technique.

As usual the complete runnable source is on GitHub.

Let’s take another look at the scanning technique. This approach suffers from the problem of repeated scans. There is an underlying scan to get the unique characters, and then a separate scan for each character. Every time there is a scan and we read a character without doing anything with it, we are passing over the opportunity to use information that is available to us. To remedy this, we can actually analyse each string with a single scan.

To accomplish this, we can store the information about what we see as we scan each character for each string. The best way to store this information is in a map from each character to the number of times we’ve seen that character. For each character that we come across, we just need to increment the count for that character. After we’ve scanned both strings, all we need to do is compare the maps to see if the two strings are anagrams.

Consider the following code:

import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.commons.lang.ArrayUtils;

public class AnagramDetectorFrequencyMap implements AnagramDetector {

   @Override
   public boolean isAnagram(String input1, String input2) {

      // null check and length check omitted for conciseness...

      // toCharacters() lower-cases its input, so the comparison is case-insensitive
      List<Character> input1Chars = toCharacters(input1);
      List<Character> input2Chars = toCharacters(input2);

      return createFrequencyMap(input1Chars).equals(createFrequencyMap(input2Chars));
   }

   private Map<Character, Integer> createFrequencyMap(List<Character> inputs) {
      // the count of character appearances, built in a single pass
      Map<Character, Integer> frequencies = new HashMap<>();
      for(Character c : inputs) {
         Integer count = frequencies.get(c);
         // a character we have not seen before starts at zero
         count = (count == null) ? 0 : count;
         frequencies.put(c, count + 1);
      }
      return frequencies;
   }

   private List<Character> toCharacters(String input) {
      return Arrays.asList(ArrayUtils.toObject(input.toLowerCase().toCharArray()));
   }

}

The best part is that the processing time grows linearly with string size. This is an O(N) solution, which is the best we are going to get for this particular problem, since every character has to be read at least once.

Hopefully the anagram exercises were fun and informative! A possibility for follow up versions of the exercise could be to write a functional version with lambdas in Java 8. More on that later…
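
As a rough idea of what such a functional version might look like, here is a minimal sketch that builds the same frequency map with the Java 8 streams API. The class name AnagramDetectorStreams and its static methods are made up for this illustration and are not part of the source on GitHub; it assumes a Java 8 or later runtime and, like the listing above, skips null checks.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this sketch.
class AnagramDetectorStreams {

    // Builds a character -> count map in a single pass over the string.
    static Map<Character, Long> frequencies(String input) {
        return input.toLowerCase()
                .chars()                      // IntStream of UTF-16 code units
                .mapToObj(c -> (char) c)      // box each code unit as a Character
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    static boolean isAnagram(String input1, String input2) {
        // null checks omitted, as in the listing above
        return input1.length() == input2.length()
                && frequencies(input1).equals(frequencies(input2));
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("Listen", "Silent")); // true
        System.out.println(isAnagram("apple", "papel"));   // true
        System.out.println(isAnagram("apple", "apply"));   // false
    }
}
```

The counts come out as Long rather than Integer because that is what Collectors.counting() produces; the map comparison works the same way. Adapting the sketch to the AnagramDetector interface would only mean making isAnagram an instance method.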

Code Puzzle: Anagram Detector II

In the last post, we looked at the anagram detector and a solution. This is a nice problem with a few solutions, so let’s take a look at some more.

Instead of looping and counting, another approach is to take advantage of what we already know about the nature of anagrams.

We know before we start that a string is an anagram of another if its letters can be arranged to form the other string. So another approach is to actually rearrange the letters and see if we can form the one string from the other. Equivalently, we could also rearrange the letters of both strings to see if each can be made to equal a third string. In that case the transitive property of anagrams would show that the two original inputs are anagrams. We can define this third string to be the string with sorted characters. In other words: sort the two input strings and see if the sorted strings are equal.

import java.util.Arrays;

public class AnagramDetectorSorting implements AnagramDetector {
   @Override
   public boolean isAnagram(String input1, String input2) {

      char[] input1Chars = input1.toLowerCase().toCharArray();
      char[] input2Chars = input2.toLowerCase().toCharArray();
      Arrays.sort(input1Chars);
      Arrays.sort(input2Chars);
      return Arrays.equals(input1Chars, input2Chars);
   }
}

One clear advantage of this is conciseness and clarity: there are no loops, and nothing complicated enough to even require a comment. Unfortunately this may be the least efficient of the approaches. Many popular sorting algorithms (e.g. merge sort, and quicksort on average) are O(n log(n)). In the next blog post we will see a more efficient approach.

Code Puzzle: Anagram Detector

Continuing with the series on code puzzles, here is another question: Given two input Strings, how can you detect if they are anagrams of each other (that is, they have exactly the same letters, although not necessarily in the same order)? I have seen this used as a question during technical interviews. There are a few ways to implement this, and I think it’s a somewhat richer problem than the string reversal problem. Let’s take a look.

An interface might look something like this:

public interface AnagramDetector {
   boolean isAnagram(String input1, String input2);
}

To approach this we might ask how we would do it by hand. First of all, we can instantly recognize something that is NOT an anagram: two strings of different lengths cannot, by definition, have exactly the same number of each letter, so we can eliminate that case immediately.

Continuing onward, we might count the number of occurrences of the letter “A” in each string and compare the count, then count the number of “B’s” and compare the count, and so on. As with the previous code puzzles we looked at, this can be implemented with primitives or objects, with custom algorithms or standard libraries. In this case I opted to use Objects so that I could leverage the standard library and use Collections.frequency(). Certainly if we wanted we could scan each String’s char[] by int index and write our own frequency method, but I thought this technique was clearer.

Finally, we only scan for characters that we know occur in the first String. On its own this would be a problem: if the second string were longer than the first, we might miss characters that appear only in the second string and get a false positive on the detection. But because we have already eliminated pairs of strings with different lengths, obtaining the unique characters in this way is OK. With equal lengths, any character that appears only in the second string forces a count mismatch for some character of the first.

Such a solution might look like this (null checking omitted for clarity).

import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import org.apache.commons.lang.ArrayUtils;

public class AnagramDetectorScanner implements AnagramDetector {
   @Override
   public boolean isAnagram(String input1, String input2) {
      
      // return immediately if by definition it can't be an anagram
      if(input1.length() != input2.length()) {
         return false;
      }

      // use a Collection of Characters so we can leverage Collections.frequency() below
      // (toCharacters() already lower-cases its input)
      List<Character> input1Chars = toCharacters(input1);
      List<Character> input2Chars = toCharacters(input2);

      // get unique characters so you don't scan for characters that aren't there
      Set<Character> uniqueCharacters = new TreeSet<>(input1Chars);

      // scan each string for each character
      // if the counts are different, they are not anagrams
      for(Character uniqueCharacter : uniqueCharacters) {
         // Collections.frequency() takes an Object, so the Character is passed as-is
         int count1 = Collections.frequency(input1Chars, uniqueCharacter);
         int count2 = Collections.frequency(input2Chars, uniqueCharacter);
         if(count1 != count2) {
            return false;
         }
      }

      return true;
   }
   
   private List<Character> toCharacters(String input) {
      return Arrays.asList(ArrayUtils.toObject(input.toLowerCase().toCharArray()));
   }
}
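
To see why the early length check matters, here is a small self-contained sketch of the same scanning idea with the check pulled out into its own method. The class name LengthCheckDemo and its method names are invented for this illustration, and it uses a plain loop instead of ArrayUtils so that it runs without Commons Lang.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class, not part of the original source.
class LengthCheckDemo {

    static List<Character> toCharacters(String input) {
        List<Character> chars = new ArrayList<>();
        for (char c : input.toLowerCase().toCharArray()) {
            chars.add(c);
        }
        return chars;
    }

    // Compares counts only for characters that occur in the first string.
    // Deliberately has NO length check.
    static boolean sameCountsForFirstStringChars(String input1, String input2) {
        List<Character> chars1 = toCharacters(input1);
        List<Character> chars2 = toCharacters(input2);
        Set<Character> unique = new TreeSet<>(chars1);
        for (Character c : unique) {
            if (Collections.frequency(chars1, c) != Collections.frequency(chars2, c)) {
                return false;
            }
        }
        return true;
    }

    // The same scan, guarded by the length check.
    static boolean isAnagram(String input1, String input2) {
        return input1.length() == input2.length()
                && sameCountsForFirstStringChars(input1, input2);
    }

    public static void main(String[] args) {
        // 'c' is never examined, so the unguarded scan reports a false positive
        System.out.println(sameCountsForFirstStringChars("ab", "abc")); // true
        System.out.println(isAnagram("ab", "abc"));                     // false
        System.out.println(isAnagram("ab", "ba"));                      // true
    }
}
```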

Finally of course this is the kind of code that lends itself to TDD, so a test should be part of what we write.
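
A minimal sketch of what such a test could look like is below. To stay runnable without a test framework it uses plain Java checks rather than JUnit, and it repeats the interface so the file compiles on its own. The AnagramDetectorChecks class, the SORTING stand-in implementation and the particular test cases are all assumptions made for this illustration; in practice the same cases would be run against whichever detector is under test.

```java
import java.util.Arrays;

// Same shape as the interface shown earlier, repeated so the sketch compiles alone.
interface AnagramDetector {
    boolean isAnagram(String input1, String input2);
}

// Hypothetical test harness, not part of the original source.
class AnagramDetectorChecks {

    // Stand-in implementation (sort and compare) so there is something to test.
    static final AnagramDetector SORTING = new AnagramDetector() {
        @Override
        public boolean isAnagram(String input1, String input2) {
            char[] a = input1.toLowerCase().toCharArray();
            char[] b = input2.toLowerCase().toCharArray();
            Arrays.sort(a);
            Arrays.sort(b);
            return Arrays.equals(a, b);
        }
    };

    // Fails loudly if the detector disagrees with the expected answer.
    static void check(AnagramDetector detector, String a, String b, boolean expected) {
        if (detector.isAnagram(a, b) != expected) {
            throw new AssertionError("isAnagram(\"" + a + "\", \"" + b + "\") should be " + expected);
        }
    }

    // Cases chosen to cover the situations discussed above.
    static void runAll(AnagramDetector detector) {
        check(detector, "listen", "silent", true);  // plain anagram
        check(detector, "Listen", "Silent", true);  // case is ignored
        check(detector, "", "", true);              // two empty strings
        check(detector, "ab", "abc", false);        // different lengths
        check(detector, "aab", "abb", false);       // same letters, different counts
    }

    public static void main(String[] args) {
        runAll(SORTING);
        System.out.println("all checks passed");
    }
}
```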

There are some other approaches to the anagram problem as well that we can explore in more detail next time.
