A little Groovy script to parse your LinkedIn contacts
I intend to improve this a bit and convert it into a web crawler at some point. For now, I took my LinkedIn contacts page and analysed the HTML markup using Firefox and the Firebug plug-in.
LinkedIn Contacts analysed in Firefox Firebug
Look at the UL node with the class “conx-list”, on the third line from the bottom of the image above: this is the node where all the good stuff resides.
In the next screenshot you will see a “letter divider” LI node, followed by a series of LI nodes, all with numeric ids. Those are the nodes for the individual contacts.
First contact. It's the LI node with a numeric id attribute after the letter divider
I added the NekoHTML JAR and the latest XercesImpl JAR to the classpath in the GroovyConsole. You can do this from the Script menu.
I saved the page to my desktop as a complete web page so that I didn’t have to contend with automating a login form, then used Groovy’s excellent XmlParser to pull out the pertinent data and format it as an XML extract.
Groovy LinkedIn contact parsing Script
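For readers who cannot see the screenshot, here is a minimal sketch of the same idea, not the original script. It finds the “conx-list” UL, keeps the LI children with numeric ids, and writes them out with MarkupBuilder. The sample markup, the “letter-divider” and “name” class names, the contact names and the contacts.html file name are all invented for illustration. The sketch parses a small well-formed snippet with the plain XmlParser so it runs without extra JARs. For the real saved page you would pass the NekoHTML SAX parser to the XmlParser constructor, as in the commented line. NekoHTML reports element names in upper case by default, so the sketch compares names case-insensitively.

```groovy
import groovy.xml.MarkupBuilder
import groovy.xml.XmlParser   // Groovy 3+; on older versions drop this line (groovy.util.XmlParser is auto-imported)

// Element name without any namespace, lower-cased so 'UL' and 'ul' both match
def tag = { node ->
    def n = node.name()
    (n.respondsTo('getLocalPart') ? n.localPart : n.toString()).toLowerCase()
}

// Tiny well-formed stand-in for the saved contacts page (invented data)
def sample = '''<html><body>
<ul class="conx-list">
  <li class="letter-divider">A</li>
  <li id="1001"><span class="name">Ada Example</span></li>
  <li id="1002"><span class="name">Alan Sample</span></li>
</ul>
</body></html>'''

def page = new XmlParser().parseText(sample)
// With NekoHTML on the classpath, the saved file could be parsed like this instead:
// def page = new XmlParser(new org.cyberneko.html.parsers.SAXParser()).parse(new File('contacts.html'))

// The UL that holds the contacts
def list = page.depthFirst().find {
    it instanceof Node && tag(it) == 'ul' && it.'@class' == 'conx-list'
}

// Contacts are the LI children whose id is purely numeric; the letter divider has no such id
def contactNodes = list.children().findAll {
    it instanceof Node && tag(it) == 'li' && ((it.'@id' ?: '') ==~ /\d+/)
}

// Write the extract as XML
def writer = new StringWriter()
new MarkupBuilder(writer).contacts {
    contactNodes.each { li -> contact(id: li.'@id', li.text().trim()) }
}
println writer
```

Filtering on the numeric id is what skips the letter-divider rows. If the real markup nests the name and job title in separate child elements, the call to li.text() would need replacing with lookups on those children.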
Some other links that helped me were:
Groovy in Action