Michael Morin

Verbal Expressions, Useful or Added Weight?

By September 30, 2013

I've recently come across the concept of Verbal Expressions, a kind of domain-specific language for regular expressions.  They draw out regular expressions into small chunks that are English-like, composable and testable.  While I have mixed feelings about this, I do see them shedding light on how we go about writing regular expressions.

It's a bit difficult to explain, so I'll just show you.  This is an example given by RubyVerbalExpressions:

tester = VerEx.new do
  start_of_line
  find 'http'
  maybe 's'
  find '://'
  maybe 'www.'
  anything_but ' '
  end_of_line
end

This is roughly equivalent to the regular expression /^https?:\/\/(www\.)?[^ ]+$/; in other words, it just matches a URL.  I've been using regular expressions for 15 years or so (when I first learned about them through the UNIX tools and then Perl, I was utterly blown away at how useful they were), so that regular expression is perfectly readable to me.  It doesn't need to be made more readable.  And in order to use the verbal expression version, you still need some understanding of regular expressions anyway.  Ultimately I see verbal expressions as an unnecessary complication to something that works just fine.  However, they did make me realize something.

We (or at least I) have been going about regular expressions all wrong in a test-driven development sense.  My regular expressions are monolithic and largely untested chunks.  I understand them well, I'm confident I wrote them correctly, so why test them?  That sounds like a recipe for disaster.  Instead, break your regular expression into chunks.  Test each chunk.  Compose them at runtime using Regexp.compile.  This also allows you to reuse parts of your regular expressions, and your code becomes more DRY.  This might not be a huge part of your programs, but it is something that can be improved on.
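Here is a minimal sketch of what that could look like for the URL pattern above.  The constant names (SCHEME, WWW, REST, URL) and the full_match? helper are made up for illustration; only Regexp.compile and Regexp#source come from Ruby itself.

```ruby
# Hypothetical chunks: each one is a small Regexp that can be tested alone.
SCHEME = %r{https?://}   # "http://" or "https://"
WWW    = /(?:www\.)?/    # optional "www." prefix
REST   = /[^ ]+/         # one or more non-space characters

# Regexp#source returns the pattern text without the delimiters,
# so the chunks can be spliced into one larger pattern string.
URL = Regexp.compile("^#{SCHEME.source}#{WWW.source}#{REST.source}$")

# Illustrative helper: true only if the chunk matches the entire string.
# \A and \z anchor to the start and end of the string.
def full_match?(chunk, text)
  !Regexp.compile("\\A(?:#{chunk.source})\\z").match(text).nil?
end

# Chunk-level checks
raise "SCHEME should accept https" unless full_match?(SCHEME, "https://")
raise "SCHEME should reject ftp"   if full_match?(SCHEME, "ftp://")
raise "REST should reject spaces"  if full_match?(REST, "has space")

# Check of the composed expression
raise "URL should match" unless URL =~ "https://www.example.com/index.html"
```

In a real project the raise lines would live in whatever test framework you already use.  The point is only that each chunk gets its own checks, and SCHEME or REST can be reused in other expressions.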

Comments
October 16, 2013 at 2:50 am
(1) Peter says:

As you wrote: you’ve been writing regex for 15 years so it’s “normal” that you understand it well. I (try to) use it about 3 or 4 times / year in Notepad++ to do some intelligent changes to texts and something like these verbal expressions would make this more readable to me.

