Archive for the ‘Regex’ Category

Debuggex [Emacs Alternative, Others?]

Saturday, April 13th, 2013

Debuggex: A visual regex helper

Regexes (regular expressions) are a mainstay of data mining/extraction.

Debuggex is a regex debugger with visual cues to help you write and debug regular expressions.

The webpage reports that full JavaScript regexes are not yet supported.

If you need a fuller alternative, consider debugging regular expressions in Emacs:

M-x regexp-builder

which shows matches as you type.

Be aware that regex languages vary (no real surprise).

One helpful resource: Regular Expression Flavor Comparison
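A small illustration of how flavors differ, using Python's standard re module as the example engine (my choice, not something the comparison page prescribes). In Python 3, \d on a text pattern matches any Unicode decimal digit, whereas in JavaScript \d means only 0-9. The sample string is an assumption made up for the demonstration.

```python
import re

# Arabic-Indic digits one, two, three (sample input chosen for illustration).
arabic_indic = "\u0661\u0662\u0663"

# Python 3 str patterns are Unicode-aware by default, so \d accepts these digits.
unicode_match = re.fullmatch(r"\d+", arabic_indic) is not None

# The re.ASCII flag narrows \d to [0-9], closer to what JavaScript does.
ascii_match = re.fullmatch(r"\d+", arabic_indic, flags=re.ASCII) is not None

# An explicit character class behaves the same way in most flavors.
explicit_match = re.fullmatch(r"[0-9]+", arabic_indic) is not None

print(unicode_match, ascii_match, explicit_match)  # True False False
```

If a pattern has to travel between tools, spelling out [0-9] is usually the safer habit.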

Working with Pig

Saturday, February 16th, 2013

Working with Pig by Dan Morrill. (video)

From the description:

Pig is a SQL like command language for use with Hadoop, we review a simple PIG script line by line to help you understand how pig works, and regular expressions to help parse data. If you want a copy of the slide presentation – they are over on slide share http://www.slideshare.net/rmorrill.

Very good intro to Pig!

Mentions a couple of resources you need to bookmark:

Input Validation Cheat Sheet (The Open Web Application Security Project – OWASP) – regexes to re-use in Pig scripts. Lots of other regex cheat sheet pointers. (Bear in mind that “\” must be escaped in Pig.)

Regular-Expressions.info – A more general resource on regexes.
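On the backslash point: a minimal sketch of preparing a borrowed regex for a Pig script, assuming the only adjustment needed is doubling each backslash inside a single-quoted Pig string. The helper names, the relation "logs" and the field "ip" are hypothetical, and patterns containing single quotes are not handled.

```python
def escape_for_pig(pattern: str) -> str:
    """Double every backslash so the regex survives Pig's string-literal parsing."""
    return pattern.replace("\\", "\\\\")


def pig_filter(relation: str, field: str, pattern: str) -> str:
    """Build an illustrative FILTER ... MATCHES statement (hypothetical helper)."""
    return f"FILTER {relation} BY {field} MATCHES '{escape_for_pig(pattern)}';"


# A regex as it might appear on a cheat sheet, written for a typical engine.
cheat_sheet_regex = r"\d+\.\d+"

print(pig_filter("logs", "ip", cheat_sheet_regex))
# FILTER logs BY ip MATCHES '\\d+\\.\\d+';
```

Generating the statement from the original pattern keeps the cheat-sheet version readable and avoids hand-counting backslashes.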

I first saw this at: This Quick Pig Overview Brings You Up to Speed Line by Line.

C++11 regex cheatsheet

Sunday, July 22nd, 2012

C++11 regex cheatsheet

A one page C++11 regex cheatsheet that you may find useful.

Curious though, how useful do you find colors on cheatsheets?

Or are there cheatsheets where you find colors useful and others not?

If so, what seems to be the difference?

Not an entirely idle query. I want to author a cheatsheet or two, but want them to be useful to others.

At one level, I see cheatsheets as being extremely minimalistic, no commentary, just short reminders of the correct syntax.

A step up from that level, perhaps for rarely used commands, a bit more than bare syntax.

Suggestions? Pointers to cheatsheets you have found useful?

ack

Tuesday, October 18th, 2011

ack

From the webpage:

ack is a tool like grep, designed for programmers with large trees of heterogeneous source code.

ack is written purely in Perl, and takes advantage of the power of Perl’s regular expressions.

It is said to be “pure Perl” so Robert shouldn’t have any problems running it on Windows. ;-)
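To make the "grep for source trees" idea concrete, here is a rough Python sketch of it: walk a directory, skip hidden directories, look only at files with source-like extensions, and report lines matching a regex. This is not how ack is implemented; the function name and the extension list are assumptions for illustration only.

```python
import os
import re
from typing import Iterator, Tuple

# Hypothetical extension filter; ack's real file-type rules are richer.
SOURCE_EXTENSIONS = {".py", ".pl", ".c", ".h", ".js"}


def search_tree(root: str, pattern: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, line number, line) for every matching line under root."""
    regex = re.compile(pattern)
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden directories (such as .git) in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in sorted(filenames):
            if os.path.splitext(name)[1] not in SOURCE_EXTENSIONS:
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, lineno, line.rstrip("\n")


if __name__ == "__main__":
    for path, lineno, line in search_tree(".", r"TODO"):
        print(f"{path}:{lineno}:{line}")
```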

Seriously, the more I think about something Lars Marius said to me years ago, about it all being about string matching, the more that rings true.

Granted, we attach semantics to the results of that string matching, but insofar as our machines are concerned, it’s just strings. We may have defined complex processing for strings, but so long as they are not viewed by us, they remain simply strings.

(What I remember of conversations and remarks is always subject to correction by others who were present. I am sure their memories are better than mine.)