Anitomy is a great matcher, and there's also my engine, which goes about it a different way. Admittedly mine is a simple proof of concept (though I have an almost-finished refactored version in ES6 with a 98% success rate, weeee!).
That being said, the test data is honestly the most useful part of mine. I've thrown the data on GitHub in the hope that other devs can use it to improve their matchers. It's UTF-8 with one filename per line. I'm thinking of adding more filenames to it, maybe by asking some private trackers if they'd be willing to provide more data.
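If you want to point your own matcher at the data, something like this rough sketch should do. It only assumes what I said above: UTF-8, one filename per line. Everything else is made up for illustration: the function names, the `toy_matcher` regex (which is neither my engine nor Anitomy), and counting "the matcher returned something" as a success, since the file as described only holds filenames and no expected results.

```python
import re
from pathlib import Path
from typing import Callable, Iterable, Optional


def load_filenames(path: str) -> list[str]:
    """Read a UTF-8 file with one filename per line, skipping blank lines."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def success_rate(
    filenames: Iterable[str],
    matcher: Callable[[str], Optional[dict]],
) -> float:
    """Fraction of filenames for which the matcher returns a result.

    Assumption: "success" just means the matcher did not give up.
    A stricter check would compare against known-good fields.
    """
    names = list(filenames)
    if not names:
        return 0.0
    matched = sum(1 for name in names if matcher(name) is not None)
    return matched / len(names)


# Hypothetical stand-in matcher: "[Group] Title - 01 ..." style names only.
_PATTERN = re.compile(
    r"^\[(?P<group>[^\]]+)\]\s*(?P<title>.+?)\s*-\s*(?P<episode>\d+)"
)


def toy_matcher(name: str) -> Optional[dict]:
    """Return group/title/episode as strings, or None if the name doesn't fit."""
    m = _PATTERN.match(name)
    return m.groupdict() if m else None
```

Then it's just `success_rate(load_filenames("filenames.txt"), your_matcher)`, where `filenames.txt` is whatever you saved the data as and `your_matcher` is any function that takes a filename and returns a dict or `None`.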
I guess what I'm saying is, make some gr8 shit dennis