Managing user input

When dealing with user input, Murphy’s Law applies in full force: “Anything that can go wrong will go wrong”. Or as Douglas Adams would have it:

“A common mistake that people make when trying to design something completely foolproof is to underestimate the ingenuity of complete fools”

This is a common crux for us developers: how do we architect a completely foolproof system? There are many routes you can take to do this, but in this particular blog post, I would like to focus on a specific aspect of this: managing user inputs.

This applies especially to situations where you give the user full control over an input, say a text field, and ask them for a specific piece of information. The problem is that you cannot know what the user will type. You might be asking for a telephone number, and they might give you their name. It’s not unheard of for attackers to exploit this mechanism to inject SQL code into a database. In a less extreme scenario, a user might accidentally type a double space somewhere in their address, which can cause all sorts of bugs down the line, for instance when you later ask them to confirm their address against what’s in your database.

The moral is: as a developer, it’s your duty to always check any information provided by a user. This serves a twofold purpose:

  • You make sure that the data provided by your users is valid, and offer timely feedback otherwise.
  • You protect yourself against malicious attacks.

Fortunately, simple string validation goes a long way here. So let’s go ahead and create an extension for our String objects and call it “String+Cleanups”.

Where things can go wrong:

I can come up with a few hypothetical scenarios where things might go wrong with user input:

  • Incorrect use of spaces, like double spacings, or spaces at the beginning or end of an input.
  • Incorrect use of return characters, like double returns, or returns at the beginning or end of an input.
  • Use of diacritics in a diacritic-sensitive context, like trying to store an ASCII-compliant password in a database. This applies especially to non-English languages.
  • Incorrect character usage, say using the ‘*’ character when typing a name.

Having identified these, we can think of a strategy to counteract them, such as:

  • We validate user input as the user is typing, depending on our character requirements.
  • We do a cleanup check the moment users stop interacting with editable fields.
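
The two steps of this strategy could be sketched roughly as follows, in current Swift syntax. The helper names `shouldAccept` and `tidied`, and the set of characters allowed in a phone number, are assumptions made for illustration and are not part of the original utility class. In a UIKit app, helpers like these would typically be called from the text field delegate callbacks `textField(_:shouldChangeCharactersIn:replacementString:)` and `textFieldDidEndEditing(_:)`.

```swift
import Foundation

// Hypothetical helper: decides whether a proposed edit should be accepted
// while the user is typing.
func shouldAccept(_ replacement: String, allowed: CharacterSet) -> Bool {
    // Accept the edit only if every Unicode scalar belongs to the allowed set.
    // An empty replacement (a deletion) is always accepted.
    return replacement.unicodeScalars.allSatisfy { allowed.contains($0) }
}

// Hypothetical helper: tidies the final value once the user leaves the field.
func tidied(_ text: String) -> String {
    return text.trimmingCharacters(in: .whitespacesAndNewlines)
}

// Characters we might allow in a phone number field (an assumption).
let phoneCharacters = CharacterSet.decimalDigits.union(CharacterSet(charactersIn: "+ "))

print(shouldAccept("555", allowed: phoneCharacters))
print(shouldAccept("abc", allowed: phoneCharacters))
print(tidied("  +44 555 0100 \n"))
```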

This takes us into the meaty part of our blog post: the code…

Cleaning up those strings

Because we are dealing with strings in Swift, we need to consider both constant (“let”) and variable (“var”) strings. This means that we will end up with a suite of paired functions for each type of cleanup: one that returns a new string and one that mutates the string in place. As you will see below, the mutating functions always leverage the returning ones in order to change the value of “self”, which gives us a consistent pattern for approaching the problem.

 

Trimming

For our first case, let’s create a function that will trim a string (that is, remove characters from its front and tail) to make sure there are no extraneous space ‘ ’ or return ‘\n’ characters. It looks a little something like this:


var trimmedString : String {
  return self.stringByTrimmingCharactersInSet(NSCharacterSet.whitespaceAndNewlineCharacterSet())
}

mutating func trim() {
  self = self.trimmedString
}

This is fairly straightforward and is basically just a shortcut wrapper for a common Foundation utility. Prefixing the second function with the “mutating” keyword allows us to change the value of the string within the method’s implementation, so it can be called on variable strings. The nice thing about this is that it allows the following usage during development:


var string = " hello world\n "
let trimmed = string.trimmedString
print(trimmed) //prints "hello world"
string.trim() //'string' is now: "hello world"
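
The snippets in this post use the Foundation names from the Swift 2 era. On current Swift versions the same pair might look like the sketch below, assuming the renamed API `trimmingCharacters(in:)` and wrapping both members in an extension so the block compiles on its own. The variable name `greeting` is only for illustration.

```swift
import Foundation

extension String {
    /// A copy of the string without leading or trailing whitespace and newlines.
    var trimmedString: String {
        return trimmingCharacters(in: .whitespacesAndNewlines)
    }

    /// Trims the string in place by reusing the returning variant.
    mutating func trim() {
        self = trimmedString
    }
}

var greeting = " hello world\n "
let trimmed = greeting.trimmedString // works on constants and variables
greeting.trim()                      // only works on variables
print(trimmed == greeting)
```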

Cleaning

Now that we have this nice little shortcut, we need to start thinking about more complex cases. So… what if the user has added some extraneous spaces or carriage returns in between the words? This can be solved easily using the following functions:


var cleanString : String {
  var lines = self.componentsSeparatedByCharactersInSet(NSCharacterSet.newlineCharacterSet()).filter({ !$0.isEmpty })
  for index in 0..<lines.count {
    let words = lines[index].componentsSeparatedByCharactersInSet(NSCharacterSet.whitespaceCharacterSet()).filter({ !$0.isEmpty })
    lines[index] = words.joinWithSeparator(" ")
  }
  return lines.joinWithSeparator("\n")
}

mutating func cleanup() {
  self = self.cleanString
}

We first break the paragraph up into lines and filter out the empty ones, then repeat the same process for the words in each line and join everything back together. Even though you could consider this function a bit of an overkill, it ensures that the input we get from a user is nice and tidy, for example:


var string = "\n\n This is a string with lots of double spaces. \n\n And a newline.\n Or two. "
let cleaned = string.cleanString
print(cleaned) //prints "This is a string with lots of double spaces.\nAnd a newline.\nOr two."
string.cleanup() //string is now "This is a string with lots of double spaces.\nAnd a newline.\nOr two."
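
For comparison, here is a sketch of the same idea in current Swift syntax, using only the standard library’s `split(whereSeparator:)`. It is a variation rather than a line-by-line port: it additionally drops lines that contained nothing but spaces, which is an assumption about the desired behaviour. The variable name `messy` is only for illustration.

```swift
extension String {
    /// A copy with repeated spaces collapsed and empty lines removed.
    var cleanString: String {
        return self
            // Split into lines; empty subsequences are omitted by default.
            .split(whereSeparator: { $0.isNewline })
            // Split each line into words and rejoin them with single spaces.
            .map { line in
                line.split(whereSeparator: { $0.isWhitespace }).joined(separator: " ")
            }
            // Drop lines that held only whitespace (assumed behaviour).
            .filter { !$0.isEmpty }
            .joined(separator: "\n")
    }

    /// Cleans the string in place by reusing the returning variant.
    mutating func cleanup() {
        self = cleanString
    }
}

var messy = "\n\n  This  is   messy. \n   \n And a newline.\n Or  two. "
messy.cleanup()
print(messy)
```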

Normalizing

Normalizing a string is a bit more obscure. In order to do this we need to “fold” each character into its non-diacritic, lowercase equivalent. This means that accented characters like “Á” or “ä” will be converted to the plain lowercase equivalent “a”. We can do this easily with this function:


var normalizedString : String {
  return self.trimmedString.stringByFoldingWithOptions([.CaseInsensitiveSearch, .DiacriticInsensitiveSearch], locale: NSLocale.currentLocale())
}

mutating func normalize() {
  self = self.normalizedString
}

Note that we are using our “trimmedString” property to clean up the string before normalizing it, so that the result never carries stray leading or trailing whitespace. A usage example is as follows:


var string = " Ècho\n\n "
let normalized = string.normalizedString
print(normalized) //prints "echo"
string.normalize() //string is now "echo"
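
In current Swift syntax, the folding step might be sketched as follows, assuming the renamed Foundation API `folding(options:locale:)`. The trimming is inlined here so that the block is self-contained, and the variable name `word` is only for illustration.

```swift
import Foundation

extension String {
    /// A trimmed, lowercased copy with diacritics removed.
    var normalizedString: String {
        return trimmingCharacters(in: .whitespacesAndNewlines)
            .folding(options: [.caseInsensitive, .diacriticInsensitive], locale: .current)
    }

    /// Normalizes the string in place by reusing the returning variant.
    mutating func normalize() {
        self = normalizedString
    }
}

var word = " Ècho\n\n "
word.normalize()
print(word)
```

Because the folding uses the current locale, as in the original function, results for some characters can vary between locales.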

Constraining

Last but not least, we can enforce the use of a specific character set by removing all other characters from a string. To do this we will need two kinds of function: one for checking character usage in a string, and one for removing invalid characters:


//validation
func containsOnlyCharactersInSet(characterSet: NSCharacterSet) -> Bool {
  return self.rangeOfCharacterFromSet(characterSet.invertedSet, options: [NSStringCompareOptions.CaseInsensitiveSearch, NSStringCompareOptions.DiacriticInsensitiveSearch], range: self.startIndex..<self.endIndex) == nil
}

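//constraining (relies on stringContainingOnlyCharactersInSet, the non-mutating counterpart, which is not listed here)
mutating func cleanupToContainOnlyCharactersInSet(characterSet: NSCharacterSet) {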
  self = stringContainingOnlyCharactersInSet(characterSet)
}

As you can see, we want our comparisons to be diacritic- and case-insensitive. This is to make sure that we don’t get unexpected behaviour when characters are used in non-standard ways. As always, we have our returning and mutating utility functions, which we can use as follows:


//validation
let sentence = "This Contains Only Letters."
sentence.containsOnlyCharactersInSet(NSCharacterSet.decimalDigitCharacterSet()) //returns false

//constraining strings
var string = "This Contains Letters and Numbers 1234567890."
let constrained = string.stringContainingOnlyCharactersInSet(NSCharacterSet.decimalDigitCharacterSet())
print(constrained) //prints "1234567890"

string.cleanupToContainOnlyCharactersInSet(NSCharacterSet.uppercaseLetterCharacterSet()) //string is now "TCLN"
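
The examples above call a returning function, `stringContainingOnlyCharactersInSet`, whose body is not listed in this post. One possible implementation is sketched below in current Swift syntax. It is an assumption, not the author’s code: it simply keeps the Unicode scalars that belong to the given set, and the names `stringContainingOnlyCharacters(in:)` and `cleanupToContainOnlyCharacters(in:)` are the hypothetical modern spellings.

```swift
import Foundation

extension String {
    /// Assumed implementation: a copy containing only the scalars in `characterSet`.
    func stringContainingOnlyCharacters(in characterSet: CharacterSet) -> String {
        var kept = String.UnicodeScalarView()
        for scalar in unicodeScalars where characterSet.contains(scalar) {
            kept.append(scalar)
        }
        return String(kept)
    }

    /// Constrains the string in place by reusing the returning variant.
    mutating func cleanupToContainOnlyCharacters(in characterSet: CharacterSet) {
        self = stringContainingOnlyCharacters(in: characterSet)
    }
}

var mixed = "This Contains Letters and Numbers 1234567890."
let digits = mixed.stringContainingOnlyCharacters(in: .decimalDigits)
mixed.cleanupToContainOnlyCharacters(in: .uppercaseLetters)
print(digits, mixed)
```

Note that this filter is case-sensitive: a set of uppercase letters keeps only uppercase letters.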

Feel free to download the full utility class from my GitHub, where you will find the companion code for this blog post and many other easy-to-use development utility classes.

Bonne chance & dev on!

Author: Danny Bravo

Director @ EPIC