Sanitizing User Input, Part II (Validation with Spring REST)

In Part I of this series on sanitizing user input, we looked at the how, why, and when of sanitization. In Part II, we will look at one technique for checking input at the time it arrives, which reduces sanitization to a validation problem. For this post, we will use Spring MVC to provide the RESTful endpoint, with domain objects automatically unmarshalled from a JSON request body.

The problem with sanitizing user input is where to do the sanitizing. It’s generally a good idea to validate input as early as possible, so it makes sense to do something like this in the controller layer. We could grab the domain object given in the controller method argument, extract all the strings we care about, and validate them one by one. This might be straightforward and easy to implement for one or two domain objects, but for a respectable system of any size, this approach would not scale very well. Let’s take a different approach.

Imagine, if you will, a world in which we can detect malicious input and reject it as invalid just as we would any other invalid input. Sanitizing would then simply be a validation problem. Importantly, validation is a solved problem with known frameworks. In this case we can leverage JSR 303 (Bean Validation) and Spring’s support for applying validation annotations to controller arguments.

Here are the steps to accomplish this:

  1. Define an annotation to be applied to an Entity’s field. This annotation will be linked to a validation implementation.
  2. Define a custom validation constraint to do the actual safety check. For this we can leverage the OWASP Java HTML Sanitizer. The sanitizer will sanitize the string, and the validation check will just be to see that the sanitized version is the same as the original version (thus showing that it does not contain content that violates your security policy).
  3. Apply the validation annotation to the Entity’s field.
  4. Apply the @Valid annotation to the Controller argument, include the BindingResult in the Controller method, and check it inside the method for finer control over sanitized input failures.

Ok, step 1: Define the annotation to be applied to the field. This is part of the standard Validation API. Note the @Constraint annotation applied to @NoHtml, and how it references a ConstraintValidator.

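import static java.lang.annotation.ElementType.FIELD;
import static java.lang.annotation.RetentionPolicy.RUNTIME;

import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.Target;
import javax.validation.Constraint;
import javax.validation.Payload;

@Target({FIELD})
@Retention(RUNTIME)
@Constraint(validatedBy = NoHtmlValidator.class)
@Documented
public @interface NoHtml {
    // TODO use a better message, look up ValidationMessages.properties
    String message() default "{org.myproject.constraints.nohtml}";
    Class<?>[] groups() default {};
    Class<? extends Payload>[] payload() default {};
}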
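
To make the link between an annotation and its validator less magical, here is a minimal sketch of the idea using only reflection from the JDK. It is not how a Bean Validation provider is actually implemented, and every name in it (AnnotationLinkSketch, FieldCheck, CheckedBy, NoAngleBrackets, Person) is hypothetical. CheckedBy plays the role that @Constraint plays above: it tells the scanning code which class performs the check.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

final class AnnotationLinkSketch {

    /** Hypothetical stand-in for a validator interface. */
    public interface FieldCheck {
        boolean isValid(String value);
    }

    /** Hypothetical stand-in for @Constraint: names the class that does the check. */
    @Target(ElementType.ANNOTATION_TYPE)
    @Retention(RetentionPolicy.RUNTIME)
    public @interface CheckedBy {
        Class<? extends FieldCheck> value();
    }

    /** Hypothetical field annotation, linked to its check through @CheckedBy. */
    @Target(ElementType.FIELD)
    @Retention(RetentionPolicy.RUNTIME)
    @CheckedBy(NoAngleBracketsCheck.class)
    public @interface NoAngleBrackets {
    }

    /** Deliberately naive check, used only to keep the sketch short. */
    public static class NoAngleBracketsCheck implements FieldCheck {
        @Override
        public boolean isValid(String value) {
            return value == null || (value.indexOf('<') < 0 && value.indexOf('>') < 0);
        }
    }

    public static class Person {
        @NoAngleBrackets
        String displayName;

        public Person(String displayName) {
            this.displayName = displayName;
        }
    }

    /** Returns the names of String fields whose linked check rejects the current value. */
    public static List<String> validate(Object target) {
        List<String> invalidFields = new ArrayList<>();
        try {
            for (Field field : target.getClass().getDeclaredFields()) {
                for (Annotation annotation : field.getDeclaredAnnotations()) {
                    // Follow the link from the field annotation to its check class.
                    CheckedBy link = annotation.annotationType().getAnnotation(CheckedBy.class);
                    if (link == null || field.getType() != String.class) {
                        continue;
                    }
                    field.setAccessible(true);
                    FieldCheck check = link.value().getDeclaredConstructor().newInstance();
                    if (!check.isValid((String) field.get(target))) {
                        invalidFields.add(field.getName());
                    }
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not run field checks", e);
        }
        return invalidFields;
    }

    public static void main(String[] args) {
        System.out.println(validate(new Person("Alice")));
        System.out.println(validate(new Person("<b>Alice</b>")));
    }
}
```

A real provider adds much more (messages, groups, payloads, caching of validator instances), which is why it is worth using the standard API rather than rolling your own.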

Step 2: The validation to be applied to fields with the specified annotation would look like this. Note that this class implements ConstraintValidator, and fits in with the standard Validation API. We can refine the behavior of this implementation by providing annotation values, and accessing them in the ConstraintValidator’s initialize() method.

import javax.validation.ConstraintValidator;
import javax.validation.ConstraintValidatorContext;
import org.owasp.html.HtmlPolicyBuilder;
import org.owasp.html.PolicyFactory;

public class NoHtmlValidator implements ConstraintValidator<NoHtml, String> {

   // http://owasp-java-html-sanitizer.googlecode.com/svn/trunk/distrib/javadoc/org/owasp/html/HtmlPolicyBuilder.html
   // the builder is not thread safe, so it is used once here and never shared;
   // only the resulting PolicyFactory is kept and reused
   private static final PolicyFactory DISALLOW_ALL = new HtmlPolicyBuilder().toFactory();

   @Override
   public void initialize(NoHtml constraintAnnotation)
   {
      // TODO specify the policy as an annotation attribute
      // to use them, values from annotation are stored in private properties here
   }

   @Override
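   public boolean isValid(String value, ConstraintValidatorContext context)
   {
      // Bean Validation convention: treat null as valid and let @NotNull own that check
      if (value == null) {
         return true;
      }
      // strict comparison: the sanitizer also HTML-encodes characters such as '&',
      // so plain text containing them is rejected as well
      String sanitized = DISALLOW_ALL.sanitize(value);
      return sanitized.equals(value);
   }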
}
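
The "sanitize, then compare" idea is easy to try on its own, without the validation framework or the OWASP library on the classpath. The sketch below is only an illustration under that assumption: SanitizeCompareSketch and its regex-based sanitize() are hypothetical stand-ins, and the regex is not a safe sanitizer. It exists only to make the comparison visible.

```java
import java.util.regex.Pattern;

final class SanitizeCompareSketch {

    // Hypothetical stand-in for a real sanitizer policy: drops anything that
    // looks like a tag. Do NOT use this for real sanitization.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    static String sanitize(String input) {
        return TAG.matcher(input).replaceAll("");
    }

    // The value is valid only when sanitizing leaves it unchanged.
    // Null is treated as valid so a separate not-null check can own that case.
    static boolean isValid(String value) {
        if (value == null) {
            return true;
        }
        return sanitize(value).equals(value);
    }

    public static void main(String[] args) {
        System.out.println(isValid("Alice"));                     // true
        System.out.println(isValid("<script>alert(1)</script>")); // false
        System.out.println(isValid(null));                        // true
    }
}
```

The check never tries to repair the input. It only reports whether the sanitizer would have changed anything, which is what lets the rest of the post treat sanitization as ordinary validation.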

Step 3: Apply the custom validation constraint to the Entity. For instance, if we had a User class, we can add the @NoHtml annotation to its display name because we know that it can be displayed on pages to other users.

@Entity
public class User {

   @Basic
   @NotNull
   @NoHtml
   @Size(min = 3, message = "must be at least three characters")
   private String displayName = "";

   ...
}

Step 4: At this point, if we attempt to save an object with this validation annotation and the annotated field contains HTML, the persistence will fail with a validation exception. This is a good first step. We can now apply input validation at any point in our domain without duplicate or excess code in our domain classes. The validation is self-contained and easy to test.

However, that validation exception is not thrown until Hibernate tries to commit the object. On principle, it would be nice to have the input checked as early as possible, at the controller layer. At that point we could also prevent the persistence-level exception and control the response going back to the client directly.

To do this, we need to add @Valid to the incoming object which was unmarshalled by Spring, and include the BindingResult in the method signature. With those in place we can detect validation errors (including sanitizing checks and all other validation) and return a more appropriate response. In the sample below, an IllegalArgumentException is thrown and a Spring exception handler (not shown) resolves that to a 400 response. There are other ways to accomplish this as well, but the point here is to show where and how the validation error can be intercepted in the controller.

@Controller
@RequestMapping(value="/resources/user")
public class UserResource {

   private UserService userService;

   @Inject
   public UserResource(UserService us) {
      userService = us;
   }

   @RequestMapping(value="/create", method=RequestMethod.POST, produces={"application/json"})
   public @ResponseBody Resource<User> createUser(@RequestBody @Valid User newUser, BindingResult binding) throws UnsupportedEncodingException
   {
      if(binding.hasErrors()) {
         // catch this and return a 400 response from your general handler
         throw new IllegalArgumentException();
      }

      // create new User here...
   }
}

And there we have it! Input sanitization solved as an input validation problem! Hopefully this is helpful to you. And if you remember nothing else from this post, remember: Assume All User Input Is Evil! 

Filed under Software Engineering