Groovy script showing most active participants on the Grails mailing list.

I thought it would be interesting to see who the most active participants on the Grails mailing list have been since I subscribed to it on Feb 2nd 2010.

Today is April 1st 2010. So this result is for the past 58 days. As you can see at the bottom, I crawled 4468 messages.

Of those, 55 didn’t match my criteria for being a mailing list entry, because the subject contained neither ‘[grails-user]’ nor ‘[via Grails]’.
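A minimal sketch of that subject check, assuming the subjects are available as plain strings. The two tags come from the criteria above; the names `LIST_TAGS` and `isListMessage` and the sample subjects are made up for illustration and are not taken from the original script.

```groovy
// Tags that mark a message as a mailing list entry (from the criteria above).
def LIST_TAGS = ['[grails-user]', '[via Grails]']

// Hypothetical helper: true when the subject carries either tag.
def isListMessage = { String subject ->
    subject != null && LIST_TAGS.any { tag -> subject.contains(tag) }
}

// Made-up sample subjects, for illustration only.
def subjects = [
    '[grails-user] GORM question',
    'Re: [grails-user] GORM question',
    '[via Grails] New plugin released',
    'Your Yahoo account summary'
]

def matched = subjects.findAll(isListMessage)
println "${matched.size()} of ${subjects.size()} subjects matched"
```

Using `contains` rather than `startsWith` means replies (‘Re: [grails-user] …’) still count as list entries.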

That leaves 4413 messages, an average of about 76 messages per day.
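As a quick check of the arithmetic, here is a small sketch that recomputes the day count and the daily average from the dates and totals quoted above. It uses `java.time`, which postdates this post; the variable names are illustrative only.

```groovy
import java.time.LocalDate
import java.time.temporal.ChronoUnit

def start = LocalDate.of(2010, 2, 2)    // subscription date
def end   = LocalDate.of(2010, 4, 1)    // date of the crawl
long days = ChronoUnit.DAYS.between(start, end)

int crawled  = 4468                      // messages crawled
int rejected = 55                        // subjects without a list tag
int listMessages = crawled - rejected

double perDay = listMessages / days

println "days: ${days}"                  // 58
println "list messages: ${listMessages}" // 4413
println "per day: ${perDay.round(1)}"    // 76.1
```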

For each sender, I saved the date of their most recent message. Then each time another post matched that ‘from’ user, I incremented their count.
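The tallying step might look something like the sketch below. The record shape (a map with `from` and `date` keys) and the sample addresses are assumptions for illustration; the post does not say how the crawled rows were stored. Dates are kept as YYYYMMDD strings so that a plain string comparison finds the most recent one.

```groovy
// Hypothetical crawled rows: sender plus date in YYYYMMDD form.
def messages = [
    [from: 'alice@example.com', date: '20100330'],
    [from: 'bob@example.com',   date: '20100328'],
    [from: 'alice@example.com', date: '20100401']
]

def stats = [:]   // sender -> [count, latest]
messages.each { msg ->
    def entry = stats[msg.from]
    if (entry == null) {
        // First time we see this sender.
        stats[msg.from] = [count: 1, latest: msg.date]
    } else {
        entry.count = entry.count + 1
        // Keep the newest date regardless of crawl order.
        if (msg.date > entry.latest) {
            entry.latest = msg.date
        }
    }
}

stats.each { sender, entry ->
    println "${sender}: ${entry.count} posts, latest ${entry.latest}"
}
```

The size of the `stats` map at the end is the number of distinct participants.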

At the end I discovered there had been 603 participants.

I sorted this by number of posts descending, date of most recent post descending, then from name ascending.
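One way to express that three-level ordering in Groovy is to chain the spaceship operator with the Elvis operator, as in this sketch. The sample participants and the field names `posts`, `latest` and `name` are invented for the example.

```groovy
// Made-up sample data; 'latest' is a YYYYMMDD string.
def participants = [
    [name: 'Carol', posts: 30, latest: '20100401'],
    [name: 'Alice', posts: 42, latest: '20100330'],
    [name: 'Bob',   posts: 30, latest: '20100401'],
    [name: 'Dave',  posts: 30, latest: '20100315']
]

// Posts descending, then latest date descending, then name ascending.
// A comparison result of 0 is falsy, so ?: falls through to the next key.
def ranked = participants.sort(false) { a, b ->
    b.posts <=> a.posts ?: b.latest <=> a.latest ?: a.name <=> b.name
}

ranked.each { println "${it.posts}  ${it.latest}  ${it.name}" }
```

Passing `false` to `sort` returns a new sorted list and leaves the original untouched.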

I saved the consolidated data and extracted the names, email addresses and date of most recent post (YYYYMMDD format) to an XML extract, ready to go in search of blogs and connect with people.
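A sketch of how such an extract could be written with Groovy’s `MarkupBuilder`. The element and attribute names (`participants`, `participant`, `name`, `email`, `latest`) and the sample people are assumptions; the layout of the actual extract is not shown in this post.

```groovy
import groovy.xml.MarkupBuilder

// Made-up sample data, for illustration only.
def people = [
    [name: 'Alice Example', email: 'alice@example.com', latest: '20100401'],
    [name: 'Bob Example',   email: 'bob@example.com',   latest: '20100328']
]

def writer = new StringWriter()
def xml = new MarkupBuilder(writer)

// One <participant> element per person, under a single root element.
xml.participants {
    people.each { p ->
        participant(name: p.name, email: p.email, latest: p.latest)
    }
}

def extract = writer.toString()
println extract
```

To save the extract to disk, the same builder can be given a `FileWriter` instead of a `StringWriter`.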

I then set a threshold of 30 and printed out every participant who had posted that many messages or more to the mailing list.
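The threshold step is a simple filter over the sorted list, roughly as sketched here. Only the 165-post figure is taken from the results in this post; the other two rows are placeholders to show the cut-off at exactly 30.

```groovy
int threshold = 30

// Already sorted by posts descending; the last two rows are placeholders.
def ranked = [
    [name: 'Burt Beckwith', posts: 165],
    [name: 'Participant B', posts: 30],
    [name: 'Participant C', posts: 29]
]

// Keep anyone at or above the threshold.
def top = ranked.findAll { it.posts >= threshold }

top.each { println "${it.posts.toString().padLeft(4)}  ${it.name}" }
```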

Now, I didn’t write this script to distinguish between messages that were questions, answers or announcements. But you can see Burt Beckwith came out as star of the show with 165 posts (2.8+ posts per day).

Congrats to everyone on the list for making this such a vibrant community.

Now on to the process involved in doing the crawl.

Here’s a snapshot of my folders. It’s the Grails User folder that gets navigated to and crawled once I’ve logged in to my Yahoo email account:


A typical row in a Yahoo Data table:

A sanity check: there really are 4468 messages. There seems to be a Yahoo caching issue. It’s the next link that gets clicked by the Selenium automation process.

Here is the script:




Here is the output from running the program:
