Friday, Sep 25, 2015

Brief report: Boston Python User Group: Favorite Libraries Meetup


Last night I attended a favorite libraries Python Meetup organized by The Boston Python User Group (@bostonpython) and hosted by Akamai. Here I'll give a brief summary of the libraries covered, and a newcomer's perspective on the event. The presentations overall were great - focused, not too long, with code examples and demos. And of course the theme was a winner - who doesn't love learning about a tasty library that may come in handy? Fun demos make it even better. I'm looking forward to attending more of them.

Neil Tenenholtz: mrjob

mrjob is a library for writing MapReduce jobs (Hadoop tutorial [here](https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html), original Google paper here). Neil did a live demo showing how little code is needed to express parallelizable tasks: at the simplest, just a mapper() and a reducer() function. He demo'd a word count and showed how to run it in different environments, including simulated, local, and Hadoop or Amazon Elastic MapReduce installations. At one of the UMass labs I worked in, we used Hadoop in a number of ways, including batch data processing and cleanup (i.e., ETL) for the machine learning algorithms, and parallelized SQL via Cloudera Impala. Yummy stuff!
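To make the mapper()/reducer() idea concrete, here is a minimal word count sketch in plain Python. It is not mrjob's API and not Neil's demo code: the `run_local` helper is a hypothetical stand-in for the grouping ("shuffle") step that a MapReduce framework performs between the two phases.

```python
from collections import defaultdict


def mapper(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Combine all the counts emitted for a single word."""
    return word, sum(counts)


def run_local(lines):
    """Hypothetical stand-in for the framework: map, group by key, reduce."""
    grouped = defaultdict(list)
    for line in lines:
        for word, count in mapper(line):
            grouped[word].append(count)
    return dict(reducer(word, counts) for word, counts in grouped.items())


word_counts = run_local(["the quick brown fox", "the lazy dog", "The end"])
```

Because the mapper only sees one line and the reducer only sees one key, a framework is free to spread both phases across many machines.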

Lindsay Raymond: Funcy

Funcy adds some functional tools to Python, "inspired by clojure, Underscore.js, and [his] own abstractions." (They reminded me a bit of the Scala idioms I saw in my brief exposure to it for a Spark GraphX project I did to implement fast graph path searches. Ask me sometime about my take on Scala as compared to Python ;-) ) Lindsay gave some examples, and it was clear that there's a lot more to the library than can be shown during a short demo, but she touched on concepts like how readable elegantly composed functions are, and how testing is eased by breaking computation down into stand-alone pieces. (As an XP fan, I really appreciated this point.) While I've done less functional programming than perhaps I'd like, funcy got me interested in learning more.
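The point about composition and testability can be sketched with nothing but the standard library. This is an illustration of the idea only, not funcy's own helpers and not one of Lindsay's examples; the `compose` function and the three small steps are hypothetical names made up for this sketch.

```python
from functools import reduce


def compose(*funcs):
    """Apply funcs right to left: compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), funcs)


def strip_blank(lines):
    """Drop lines that are empty or whitespace only."""
    return [line for line in lines if line.strip()]


def normalize(lines):
    """Trim surrounding whitespace and lowercase each line."""
    return [line.strip().lower() for line in lines]


def unique_sorted(lines):
    """Remove duplicates and return the lines in sorted order."""
    return sorted(set(lines))


# The pipeline reads right to left: strip_blank, then normalize, then unique_sorted.
clean = compose(unique_sorted, normalize, strip_blank)
result = clean(["  Beta", "", "alpha ", "beta"])
```

Each step is a stand-alone function, so each can be unit tested on its own before the pipeline is assembled.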

Scott Sanderson: Click


Click is a package that simplifies writing command-line interfaces. Scott demonstrated (with humor :-) the main features, showing the decorator-based style it uses. I often end up writing command-line tools for applications that don't need a UI or for ones that can be scripted, and I'll definitely have a look at Click the next time the need comes up. You can read the author's motivation for writing an alternative to the built-in argparse module here, and learn how it compares to others at Comparing Python Command-Line Parsing Libraries - Argparse, Docopt, and Click.
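For a sense of the baseline that Click is positioned against, here is a small command-line tool written with the built-in argparse module. It is a sketch of my own rather than anything from Scott's talk, and the `greet` command with its `name` and `--count` parameters is a made-up example.

```python
import argparse


def build_parser():
    """Declare the interface: one positional argument and one option."""
    parser = argparse.ArgumentParser(description="Greet someone a number of times.")
    parser.add_argument("name", help="who to greet")
    parser.add_argument("--count", type=int, default=1, help="number of greetings")
    return parser


def greet(argv):
    """Parse argv (a list of strings) and return the greeting lines."""
    args = build_parser().parse_args(argv)
    return ["Hello, {}!".format(args.name) for _ in range(args.count)]


greetings = greet(["--count", "2", "World"])
```

With argparse, the parser definition and the function that uses the parsed values live apart; as I understood the demo, the decorator-based style attaches the option declarations directly to the function instead.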

Amandalynne Paullada: NLTK

Natural Language Toolkit is a new favorite of mine, and I was excited to see Amandalynne's presentation. The library is rich in features, including word and sentence segmentation, part-of-speech tagging, classifiers, information extraction, and more. Amandalynne demo'd a clever nickname generator (e.g., https://en.wikipedia.org/wiki/Text_segmentation) which, as you'd expect, was a hit, even (or especially) when it surprised. I'm currently evaluating NLTK for a skeptical toolbox idea I have, using it for named-entity recognition, for example. (I learned a little about this area in my last contract; it's surprisingly difficult.) There's a very well done NLTK book, and I came away from reading it thinking that it would be a great introduction to Python for newbies, because learning the language is set in the context of text processing, which is inherently cool. Good stuff.
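To show what word and sentence segmentation involve, and why a dedicated library helps, here is a deliberately naive sketch using only regular expressions. It does not use NLTK and is not from the presentation; `split_sentences` and `split_words` are hypothetical helpers.

```python
import re


def split_sentences(text):
    """Naively split after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_words(sentence):
    """Naively extract runs of letters, digits, and apostrophes."""
    return re.findall(r"[A-Za-z0-9']+", sentence)


text = "NLTK is fun. Is segmentation easy? Not really!"
sentences = split_sentences(text)
words = [split_words(s) for s in sentences]

# The naive rule breaks on abbreviations: "Dr." is treated as a sentence end.
broken = split_sentences("Dr. Smith arrived.")
```

The abbreviation case at the end is the kind of detail that simple rules get wrong, which is a good argument for reaching for a library that has already dealt with it.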

Ned Jackson Lovely: itsdangerous and bcrypt


itsdangerous is a library for doing cryptographic signing, and can be applied to cases like secure "amnesia" ("forgotten password") emails or saving protected cookies. Using the module this way can, in some cases, eliminate the need to store hashed data in server-side tables. bcrypt is a tool for password hashing. Not having done work in this area, I can't say much about these libraries, but I fully believed Ned's point about DIY cryptography being hard to get right, and about getting professionals involved for anything important, such as protecting credit card information.
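The idea behind cryptographic signing can be sketched with the standard library's hmac module. This is a toy to show the concept, not itsdangerous's API and not Ned's code, and in keeping with his warning about DIY cryptography it is not something to deploy. `SECRET_KEY`, `sign`, and `unsign` are hypothetical names.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"  # hypothetical key; a real one must be random and private


def _digest(value):
    """Hex HMAC-SHA256 of value under the secret key."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def sign(value):
    """Append a signature so the value can be handed to an untrusted client."""
    return value + "." + _digest(value)


def unsign(token):
    """Return the original value if the signature checks out, else None."""
    value, sep, digest = token.rpartition(".")
    if not sep:
        return None
    # compare_digest avoids leaking information through comparison timing.
    if hmac.compare_digest(digest.encode("utf-8"), _digest(value).encode("utf-8")):
        return value
    return None


token = sign("user-42")
```

Because the server can verify the token by recomputing the signature, it does not need to look anything up in a table, which is the storage saving mentioned above.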
