I've recently come across the concept of Verbal Expressions, a kind of domain-specific language for regular expressions. They draw out regular expressions into small chunks that are English-like, composable and testable. While I have mixed feelings about this, I do see them shedding light on how we go about writing regular expressions.
It's a bit difficult to explain, so I'll just show you. This is an example given by RubyVerbalExpressions:
    tester = VerEx.new do
      start_of_line
      find 'http'
      maybe 's'
      find '://'
      maybe 'www.'
      anything_but ' '
      end_of_line
    end
This is roughly equivalent to the regular expression /^https?:\/\/(www\.)?[^ ]+$/; in other words, it just matches a URL. I've been using regular expressions for 15 years or so (when I first learned about them with the UNIX tools and then Perl, I was utterly blown away at how useful they were), so that regular expression is perfectly fine to me. It doesn't need to be made more readable. And in order to use the verbal expression version you still need some understanding of regular expressions anyway. Ultimately I see verbal expressions as an unnecessary complication to something that works just fine. However, they did make me realize something.
We (or at least I) have been going about regular expressions all wrong in a test-driven development sense. My regular expressions are just monolithic and largely untested chunks. I understand them well, I'm confident I wrote them correctly, so why test them? That sounds like a recipe for disaster. Instead, break your regular expression into chunks. Test each chunk. Compose them at runtime using Regexp.compile. This also lets you reuse parts of your regular expressions, so your code becomes more DRY. This might not be a huge part of your programs, but it is something that can be improved on.
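Here is a minimal sketch of what that could look like for the URL pattern above. The constant names (SCHEME, SEPARATOR, WWW, REST, URL) and the matches_whole? helper are my own inventions for illustration, not part of any library; the only real calls are Regexp.compile, Regexp#source and Regexp#match?.

```ruby
# Hypothetical chunk names: each piece is a small Regexp that can be tested alone.
SCHEME    = /https?/
SEPARATOR = %r{://}
WWW       = /(?:www\.)?/
REST      = /[^ ]+/

# Illustrative helper: does the chunk match the entire string, nothing more?
def matches_whole?(chunk, text)
  Regexp.compile('\A(?:' + chunk.source + ')\z').match?(text)
end

# Compose the chunks at runtime. Regexp#source returns the pattern text
# without the surrounding delimiters, so the pieces can simply be joined.
URL = Regexp.compile('^' + [SCHEME, SEPARATOR, WWW, REST].map(&:source).join + '$')

# Chunk-level checks
puts matches_whole?(SCHEME, 'https')
puts matches_whole?(WWW, '')

# Check of the composed expression
puts URL.match?('https://www.example.com')
```

Each chunk gets its own small test, and the composed expression gets a couple of end-to-end cases on top. Chunks like SCHEME or REST can then be reused in other patterns without copying and pasting fragments of regex around.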