Inside the PyOhio Program Process

It has been my privilege to serve as the Program Committee Chair for PyOhio 2018 and PyOhio 2019, and I'll probably do it again for 2020. I've had a lot of positive feedback about the program we've put together these past two years, and several folks suggested that it would be worth sharing a look at my process. It seemed like a good idea at the time, so here's a peek behind the scenes of our activities and all that goes into talk selection and scheduling for our conference.

Before I begin, I want to say a couple things about PyOhio that I'm really proud of. First, PyOhio is the longest continually-running regional Python conference! We just had our twelfth year and it just keeps getting better. Second, PyOhio is the only regional Python conference that's completely free to attend, thanks to our generous sponsors. Third, all of our organizers and speakers are volunteers, to whom I am extremely grateful. Last, but certainly not least, PyOhio is proud to be a welcoming and inclusive event, a platform for a diversity of voices, and a launching pad for new speakers. It's really an honor to be leading the curation of this very special event, and I'd love to have you--yes, YOU!--make my job even more difficult next year by burying me under great talk proposals.

My first order of business is making sure we have a committee. I like to have three of us in total so that we can have different perspectives and break ties if there's contention, while still being a small enough group to schedule time together (being an adult is hard, y'all).

The Program Committee's activities are tightly bound to the calendar, so it will help to start with an understanding of our timeline. PyOhio is typically held on the last weekend of July, and we aim to give speakers as close as we can to two full months to prepare their talks. I've always found it helpful to back into a deadline, so that gives us roughly:

  • End of July: conference!
  • July 1: schedule announced
  • June 8: confirmed talk lineup announced
  • June 1: first round of acceptances delivered
  • May 15: CFP closed
  • Early to mid-March: CFP opened

Around the time the CFP opens, we start pushing communications around mentorship opportunities, checking both for folks who would like some guidance on getting a proposal put together and for those willing to mentor newer speakers. As mentors and mentees sign up, I respond first with "thanks for signing up" emails, followed by emails to mentee/mentor pairs as I group them. That's all managed via Google Docs: I have Google Forms set up for mentor and mentee intake, which both dump into Google Sheets spreadsheets. I use some conditional formatting to highlight mentors who mention PyOhio in their "past history" field, on the theory that if they've attended or spoken at PyOhio, they will be more familiar with the event and the vibe we try to establish, as well as folks who have mentored speakers in the past. I add fields to these sheets to track the dates of all the points of contact, such as acceptance and pairing, and I record the mentor/mentee pairings in each sheet as well. I also track whether a mentee has a talk accepted. We ask mentees to indicate the areas they are being mentored in, like submitting a talk proposal or preparing their talks, which I use to make the pair introduction email a little more meaningful. I should probably look into automating the email processes, but the traffic has been low enough that it's been okay to just keep the email templates in a Google Doc and do the copy/paste dance as needed.
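If the mentorship traffic ever outgrows the copy/paste dance, the first automation step might look something like this minimal sketch (the CSV columns and the template text are hypothetical stand-ins for my actual sheets and Google Doc):

    import csv
    from string import Template

    # Hypothetical export of the pairings sheet: one row per
    # mentor/mentee pair, with names, emails, and mentorship areas.
    INTRO = Template(
        "Hi $mentor and $mentee!\n\n"
        "$mentee, meet $mentor, who's offered to help with: $areas.\n"
        "I'll let you two take it from here. Thanks for participating!"
    )

    def intro_emails(pairings_csv):
        """Yield (recipients, body) for each mentor/mentee pairing."""
        with open(pairings_csv, newline="") as f:
            for row in csv.DictReader(f):
                body = INTRO.substitute(
                    mentor=row["mentor_name"],
                    mentee=row["mentee_name"],
                    areas=row["mentorship_areas"],
                )
                yield [row["mentor_email"], row["mentee_email"]], body

    for recipients, body in intro_emails("pairings.csv"):
        print(", ".join(recipients), body, sep="\n")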

It's then pretty quiet until we approach the close of the CFP; that's when the panic sets in as we see how few proposals have been submitted up to that point. Most of our proposals come in over the final week of the CFP, with about 50% of all proposals arriving in the last day. Folks do seem to enjoy procrastination! In 2018 we closed at midnight Eastern time; we changed this year to use an "anywhere on earth" (or "AoE") cutoff, and that proved to be really beneficial, so I think we'll probably do that going forward.
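(If "anywhere on earth" is new to you, it just means UTC-12: a deadline hasn't passed until it has passed everywhere on the planet. A quick sketch, with an illustrative deadline date:)

    from datetime import datetime, timedelta, timezone

    # "Anywhere on Earth" is UTC-12; the date here is illustrative.
    AOE = timezone(timedelta(hours=-12))
    CFP_DEADLINE = datetime(2019, 5, 15, 23, 59, 59, tzinfo=AOE)

    def cfp_is_open():
        return datetime.now(timezone.utc) <= CFP_DEADLINE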

Once the CFP closes, the rush to review talks is on. Our first phase relies on input from the community, soliciting volunteers via Twitter, email, and our Slack. We try to have a meaningful number of reviews on every talk within a week of the close of the CFP. During this phase we hide the speakers' names as well as any personally-identifiable data, and we deter speakers from rating their own talks by omitting them from lists of talks to be reviewed. (Yes, this last bit relies on the honor system to a degree; please don't be a jerk.) In 2018 we changed the default sort order of proposals to be reviewed so that the first ones surfaced are those with the fewest reviews; this has been a huge improvement in making sure all the proposals get the right amount of attention and helps to even out the volume of reviews for each one. Our current review process consists of rating each proposal on a five-point scale:

  • "++": strong proposal, and I will argue for it to be included
  • "+": okay proposal, but I won't argue for it
  • "0": abstain, don't show me again
  • "-": weak proposal, but I won't argue against it
  • "--": problematic, and I will argue against its inclusion

A rating also mandates a comment (which for me is often the hardest part of reviewing a proposal); all of this is conveniently part of the CFP site.

This period is also critical for speaker feedback, with reviewers messaging speakers through the CFP site to ask questions, request clarifications, or suggest improvements. Ideally we'd be doing this throughout the CFP process, but with the bulk of our proposals arriving in the last 24 hours, it isn't practical until we're into the post-CFP review phase.

During this phase, the three Program Committee folks review every proposal in this anonymized form, and we each make short lists of our "must have" talks as well as anything that we think is not going to be a good fit.

The second phase starts after we have a reasonable volume of reviews for all the talks and tutorials. We disable the community access to the review process and de-anonymize all the proposals. This is often where my heart breaks as I discover that three of my "must have" talk proposals are from the same speaker and I have to decide which of my darlings to throw overboard. Optimizing for diversity of voices means that we should only be taking one talk per speaker.

I build two more Google Sheets at this point, one to track talks and the other for tutorials. Into these sheets go dumps of the title, speaker, and voting data. I also add in the voting results specifically from the Program Committee members in separate columns. Columns are added for the "verdict" (whether we will accept a talk, reject it, or hold it for backup), community vote scoring, Program Committee scoring, whether the speaker had been part of the mentorship program, and any notes we want to add. I apply conditional formatting to highlight:

  • the entire row of a 45-minute proposal
  • the "new talk" field for talks that haven't been given elsewhere
  • the "first time" fields for primary or secondary speakers who haven't spoken before
  • the "diversity statement" field for speakers who self-reported being part of an underrepresented group
  • the "mentored" field
  • the "verdict" field

Each sheet also gets a separate tab for metadata that tracks, for each type of proposal (30-minute talk, 45-minute talk, tutorial):

  • the number of proposals
  • the number of slots available
  • the number and percentage of slots filled
  • the number and percentage of slots that still need to be filled
  • the number and percentage of proposals by first time speakers
  • the number and percentage of proposals by underrepresented speakers
  • the number and percentage of accepted proposals from first time speakers
  • the number and percentage of accepted proposals from underrepresented speakers

I use conditional formatting here as well to highlight certain things as various conditions are met; this helps me to know where things are good and where we have more work to do. Also I just love cell backgrounds lighting up magically as the process unfolds. ;-)

To generate a community vote score for each proposal, I started by giving each talk 3 points for each "++", 2 points for each "+", 1 point for a "-", and no points for a "--" or abstention. Since not all proposals received the same number of votes, however, I divide the summed score by the maximum theoretically possible score (3 x number of votes) so that we end up with a percentage. I apply the same point assignments for the committee votes, but just use the basic sum since the committee members review every talk.
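In spreadsheet terms this is just a SUM and a division, but here's the same math as a minimal Python sketch (whether abstentions count toward the denominator is my judgment call here, and the votes are hypothetical):

    # Point values for the five-point review scale described earlier.
    POINTS = {"++": 3, "+": 2, "-": 1, "--": 0, "0": 0}

    def community_score(votes):
        """Normalize earned points against the theoretical maximum."""
        scorable = [v for v in votes if v != "0"]  # drop abstentions
        if not scorable:
            return 0.0
        return sum(POINTS[v] for v in scorable) / (3 * len(scorable))

    def committee_score(votes):
        """Raw sum; every committee member reviews every talk."""
        return sum(POINTS[v] for v in votes)

    print(community_score(["++", "+", "-", "++"]))  # 0.75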

From here we start the sorting churn, first to separate the 45-minute and 30-minute talks, then to rank by the community and committee scores. The first sort lets us confirm how many slots we will plan to fill. For 2018 and 2019, we ended up with forty 30-minute slots, eight 45-minute slots, and four 2-hour tutorial slots. These numbers get plugged into those metadata sheets so that we can drive all the formatting and percentages.

I make a first pass through to record all of our "must have" and "must reject" results from our short lists. Anywhere that we have either more than one "must have" from the same speaker or a "must have" and a "must reject" in conflict, I mark the affected talks with a verdict of "discuss" (with its own magic highlighting for visibility!). Then I'll work with the committee, either on a Hangouts call if we can sync up or in Slack if we need to be asynchronous, to work through everything else, redoing the sorting grind (now including verdict as the second column, so that all the accepted things are grouped together) as we make changes. Keeping this sorting going is annoying in Google Sheets because it isn't easy to reapply a complicated multi-column sort, but it's worth it to keep things grouped together since it limits the scope of the problem and really helps us focus. This part is mostly painless until we get down into the last few slots and have to make hard decisions. It's also important to decide between tutorial and talk proposals from the same speaker; this often depends on other proposals that might do a good job of supporting, leading into, or playing off of a particular talk or tutorial.
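(For contrast, if this data lived in a CSV rather than Google Sheets, reapplying that multi-column sort would be a one-liner with pandas; a hypothetical sketch with made-up column names:)

    import pandas as pd

    # Hypothetical columns mirroring the talks sheet.
    talks = pd.read_csv("talks.csv")
    talks = talks.sort_values(
        by=["verdict", "committee_score", "community_score"],
        ascending=[True, False, False],  # group verdicts, best scores first
    )
    print(talks[["title", "verdict", "committee_score"]].to_string())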

Besides the two scores, we're also looking carefully at proposals from first-time and underrepresented speakers, as PyOhio prides itself on expanding the diversity of voices that we're giving a platform to. We continue to focus here because, while we have been very pleased with the progress we have made so far, we have a long way to go before we can be satisfied.

We also look for any underappreciated gems lurking in the lower-scoring talks; there's often something weird, different, and wonderful hiding in there that's worth boosting up.

Hopefully, all of this second phase is completed in a few days, so that by the end of the second business week after the CFP has closed, we're able to get the thumbs-up from our conference chair and send out the first round of acceptance emails. Then my true agony begins: waiting for speakers to confirm that they will actually give the talks they proposed. As confirmations come in, I add a column to my sheets next to the "verdict" for the "outcome", marking whether the talk was "confirmed", "declined", or is "pending" based on things the speaker needs to check on. This year I also added a column to the spreadsheets to record the date that the acceptance went out so that I can easily flag things that have not had a response after a couple of days. For each response that comes in, I also send a reply email to ensure a good feedback loop with the speaker; I feel it's especially important to invite folks who had to decline the acceptance to submit again in the future.

As we have speakers decline, we dip back into our maybes for replacements, checking in on Slack to discuss options, and the process repeats until we've confirmed a full talk lineup. This loop takes a speaker's geographic origin into more consideration than the first pass, since someone who is local or from an adjacent state may not have the same travel challenges that might have prevented a more distant speaker from confirming an accepted talk. This is also a step where we shine a light on talks that might have gone underappreciated during the review process. (I should note here that we have been fortunate to have some really great talks amongst the "maybe" pile that have gone on to be very well received; these aren't "bad" talks by any stretch of the imagination, they just weren't quite in the first cut.) When the lineup is fully confirmed, I'm able to tend to the unfortunate duty of sending rejection notices. When that's complete, we can announce the lineup, hopefully by about a week after the first acceptances are sent.

Once the talk lineup is confirmed, it's time to make the actual schedule! I use a site called Padlet to help with this; it's a virtual sticky note board that's simple to use and can be operated collaboratively, so it's a great fit for us. It's really nice to explore the scheduling problem spatially and quickly swap things around. I start with making an empty schedule grid, with times down the left side, and rooms, with their seating capacities, across the top. Next I make cards for all of the talks and tutorials, grouped by size/duration. Each card gets the talk ID (so that I can quickly jump to the proposal details using a custom Alfred shortcut), an abbreviated title, the speaker name(s), the speaker's state or other geographical origin point, and optionally some single-character tokens to indicate whether the speaker is a first-timer or is underrepresented. I also color-code the card if I've made any promises to a speaker about scheduling (e.g., if someone has requested a particular day or time based on travel needs) and also note that on the card.

Our current conference rooms vary significantly in seating capacity, so gauging the prospective interest in each session is vital to getting talks into appropriately-sized rooms. To do this, I create a survey in Google Forms to allow the broader community to indicate their interest in each talk. This is basically two questions with "check all that apply" answers: one for 45-minute talks and one for 30-minute talks. I enable random ordering for the response options in each question so that we can correct for any bias in the order they're presented in. This goes out at the same time that the talk lineup is announced, and we run the survey for about two weeks, by which point we've pretty much gotten all of the responses we're going to get. I add each talk's vote total to its card in Padlet so that I can use it as a factor in room assignment.
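Tallying those responses is quick work; here's a minimal sketch, assuming a Google Forms CSV export where each checkbox answer cell joins the checked titles with ", " (naive, and it would break on a talk title containing a comma):

    import csv
    from collections import Counter

    def tally_interest(survey_csv, column):
        """Count how many respondents checked each talk title."""
        votes = Counter()
        with open(survey_csv, newline="") as f:
            for row in csv.DictReader(f):
                for title in filter(None, row[column].split(", ")):
                    votes[title] += 1
        return votes

    for title, count in tally_interest("survey.csv", "30-minute talks").most_common():
        print(f"{count:3d}  {title}")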

Once all the talk cards exist, and we've got the interest data, each member of the Program Committee makes a pass through them to group them thematically, so that we can understand where we have clusters of topics or things that we want to thread together to provide an extra level of meta-narrative for attendees.

We start with scheduling the tutorials first, since there are only four of them, and they're going into a single known room. I'd like to say we use a rigorous process here, but it mainly comes down to putting them into a sequence that doesn't feel too mentally overwhelming for us if we were to attend all four (respecting any promises noted above).

Talk scheduling is a bit more challenging. I start by placing anything where we had a commitment to a speaker in roughly the right time slot, knowing I can just scoot it into a different room as I get going. In a working area, I sort the cards into columns based on the number of survey votes they received and their topical grouping. This gives me a very approximate sense of which talk is headed to which room. Then it's off to the races as I start dragging talks into place in my grid area to make the first draft of the full schedule!

There are many dimensions to consider here, and like it or not, the choices that are made reflect the values of the event and its organizers. Factors I consider include:

  • the surveyed interest in the talk
  • the speaker's geographic origin
  • the subject of the talk
  • the speaker's first-time and underrepresentation status
  • the audience level of the talk

I start with the highest-voted talks first, placing them into the two larger rooms, then the lowest-voted talks into our smallest room. Some exceptions are made where we feel strongly about the value of a talk that might have been overlooked during the survey process; often we can get a bigger audience for an underappreciated talk by simply programming it into a bigger room, signaling that we feel it deserves to be there.

To accommodate our speakers' travel needs, the speaker's geographic origin plays a big role in choosing which block a talk goes into. We basically have four big sections: Saturday morning, Saturday afternoon, Sunday early afternoon, and Sunday late afternoon. That late Sunday afternoon block is typically all speakers from Ohio or the surrounding area for whom driving home wouldn't be super inconvenient. Speakers from places that have reasonable Sunday evening flight options can go into the Sunday early-afternoon block. Speakers with fewer flight options usually end up on Saturday, with anyone from the west coast going into Saturday afternoon to minimize their time zone pain.

The topic of the talk is important too; attendees interested in a particular subject will probably appreciate being able to attend more than one talk in that subject area, so it doesn't make sense to have all the machine learning talks at the same time, since someone could only attend one. I try to layer those into the schedule vertically, sequencing them so we can try to wring some synergy from them; a good example here was arranging two talks about Pytest such that the more introductory one came first and the more in-depth one came second, and both of them preceded a tutorial on automated web UI testing. (The speakers even picked up on this after the conference, which was utterly gratifying!) Talks on topics which we feel are important or should be emphasized (such as--but not limited to--testing, security, ethics, humane cultural practices) often get programmed into the bigger rooms. Talks on more niche subjects, personal projects, and curiosities often end up in one of the smaller rooms.

From a diversity standpoint, it's not enough to have a particular quantity of underrepresented or first-time speakers; they have to be visible too. That means scheduling them throughout the conference instead of loading them up at the same time slot(s). I approach scheduling with the goal of having at least one underrepresented speaker in every time slot. I similarly thread the first-time speakers vertically through the grid so that we always have a balance of first-time and veteran speakers.

We've chosen not to prompt speakers to identify the audience level of a talk, so I take a bit of a guess at slotting a beginner-appropriate talk into every time slot. This way the less-experienced attendees will always have at least one good option to explore throughout the entire conference. I've gotten positive feedback on this, so I think we're doing okay on this front.
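To make those threading goals concrete, here's a toy sanity check; the schedule structure and flags are hypothetical stand-ins for what actually lives in Padlet and the proposal data:

    # Hypothetical grid: time slot -> talks scheduled in that slot.
    schedule = {
        "Sat 10:15": [
            {"title": "Intro to Pytest", "first_time": True,
             "underrepresented": False, "beginner_friendly": True},
            {"title": "Advanced asyncio", "first_time": False,
             "underrepresented": True, "beginner_friendly": False},
        ],
        # ... one entry per time slot ...
    }

    def check_threading(schedule):
        """Flag slots missing an underrepresented speaker, a first-time
        speaker, or a beginner-appropriate talk."""
        for slot, talks in schedule.items():
            for goal in ("underrepresented", "first_time", "beginner_friendly"):
                if not any(talk[goal] for talk in talks):
                    print(f"{slot}: missing {goal.replace('_', ' ')}")

    check_threading(schedule)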

Once I've got all the talks placed into the grid in Padlet, I export a png and share it with the committee. (I do this png step to give us an approximation of version history.) We go through a few rounds of review and adjustment, and I second guess myself a lot, but it stabilizes pretty quickly. Once the three of us feel good about it, I run it by the conference chair for approval, and after that I use the CFP site to assign every talk to a room and time.

The last thing we do as a committee is a new activity we added this year: we create a public Google Calendar for the conference events. The entire conference schedule is added to the calendar, and speakers are invited to their own talks. This way it's easy for anyone who likes to manage their conference-going with calendaring apps to pick the talks they want to attend, and speakers can effortlessly know where and when they need to be to deliver their awesome talks. We got a lot of positive feedback about it this year, so we'll definitely keep doing it as long as I'm involved.
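For the curious, the API side of that is small; here's a sketch of inviting a speaker to their own talk via the Google Calendar API (the OAuth setup is omitted, and the event details are hypothetical):

    from googleapiclient.discovery import build

    # Assumes `creds` (OAuth credentials with a calendar scope) and
    # CALENDAR_ID (the conference's public calendar) already exist.
    service = build("calendar", "v3", credentials=creds)

    event = {
        "summary": "Intro to Pytest",  # hypothetical talk
        "location": "Cartoon Room 1",
        "start": {"dateTime": "2019-07-27T10:15:00", "timeZone": "America/New_York"},
        "end": {"dateTime": "2019-07-27T10:45:00", "timeZone": "America/New_York"},
        "attendees": [{"email": "speaker@example.com"}],
    }

    created = service.events().insert(
        calendarId=CALENDAR_ID, body=event, sendUpdates="all"
    ).execute()
    print(created["htmlLink"])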

After that we sit back and wait to hear from speakers who have questions or need help with something. It's a lot of work, but it's all worth it when the conference happens and completely exceeds our expectations as it has these past two years. It's really gratifying when it all comes together and you can feel the energy from the speakers and attendees. I'm extremely thankful to my fellow committee members, the organizers, our speakers, and our attendees for the roles they play in creating this very special event.

PyCon 2018 & 2019 Dates

Since I've been asked a few times about the dates for the upcoming PyCons in my lovely city of Cleveland, Ohio, and there is surprisingly little about it in Google results, and I wouldn't mind an SEO bump, here is the scoop (with photographic proof from the closing keynote of PyCon 2017):

  • PyCon 2018: May 9–17, 2018
  • PyCon 2019: May 1–9, 2019

As usual, the first two days should be tutorials, followed by three days of conference proper, and finally four days of sprints.

I hope to see you there!

Text Me Maybe: Smarter Real-World Integrations with Python

Gosh, it's been a year since I last posted! Let me try to make it up to you...

I took some existing talks on the road last year (to CodeMash, PyCon, and OSCON!) but I've once again put together something new for PyOhio.

So my family likes to know when I'm on the way home from work, but I'm lousy at remembering to text or call before I leave. Some basic "out of the box" geofencing solutions are available, but none of them are smart enough to understand situations like going to lunch where sending a "coming home" message wouldn't be appropriate. Luckily, we can assemble our own solution pretty quickly and cheaply with Python at the core, and we don't even have to run our own servers to do it!

In this talk I showed how I created a cloud-hosted, location-triggered SMS notification service with some decision-making smarts by combining IFTTT (If This Then That), AWS Lambda, Twilio, and just the right amount of Python.
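The decision-making core ends up being just a handful of lines; here's a minimal sketch of the Lambda side (the phone numbers, environment variables, and the "is it really home time?" rule are all hypothetical stand-ins):

    import os
    from datetime import datetime, timedelta, timezone

    from twilio.rest import Client

    EASTERN = timezone(timedelta(hours=-4))  # crude; real code wants real tz data

    def lambda_handler(event, context):
        """Invoked via API Gateway when IFTTT sees me exit the geofence."""
        now = datetime.now(EASTERN)
        # The "smarts": leaving the office before mid-afternoon is
        # probably lunch, not the commute home, so stay quiet.
        if now.hour < 15:
            return {"sent": False, "reason": "probably just lunch"}

        client = Client(os.environ["TWILIO_SID"], os.environ["TWILIO_TOKEN"])
        client.messages.create(
            to=os.environ["FAMILY_PHONE"],
            from_=os.environ["TWILIO_PHONE"],
            body="On my way home!",
        )
        return {"sent": True}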

The talk seemed to go really well, and I have been flattered and humbled by the volume of positive feedback I got about it. I hope it will inspire you to go have some fun making your smart things a little smarter.

Here are the slides:

Unfortunately there's no video due to a variety of AV issues, so you'll either need to use your imagination or convince the PyCon program committee to accept it for 2017. ;-)

And who knows, maybe I'll start posting more often (hahahaahhahaahahahahahahaha *wipes away tears* whoooo wow, who am I kidding?).

Using Python to Get Out the Vote

After taking a year off from PyOhio due to a scheduling snafu (off-by-one errors apparently aren't just for software), I was delighted to be back this year, and with a fresh talk to boot.

This spring, I helped my wife with the data munging aspect of a school levy get-out-the-vote campaign. We mashed up school directory data with Ohio voter registration records to produce targeted contact lists for phone calls, mailings, and door-to-door visits, reducing redundant contacts and eliminating wasted contacts.

The technology involved is pretty straightforward: a little bit of Python and some pretty basic SQLAlchemy and Alembic (in fact, it was my first serious dive into both).
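To give a flavor of the mashup, here's a heavily simplified sketch of the idea (the schema is a hypothetical stand-in for the real directory and voter-file data):

    from sqlalchemy import Column, Integer, String, create_engine
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import sessionmaker

    Base = declarative_base()

    class Voter(Base):
        """One row per registered voter (simplified)."""
        __tablename__ = "voters"
        id = Column(Integer, primary_key=True)
        last_name = Column(String)
        street_address = Column(String)

    class Household(Base):
        """One row per school-directory household (simplified)."""
        __tablename__ = "households"
        id = Column(Integer, primary_key=True)
        last_name = Column(String)
        street_address = Column(String)

    session = sessionmaker(bind=create_engine("sqlite:///gotv.db"))()

    # Match directory households to voter records by name and address, so
    # each household gets one targeted contact instead of several.
    matches = (
        session.query(Voter)
        .join(Household, (Voter.last_name == Household.last_name)
                       & (Voter.street_address == Household.street_address))
        .all()
    )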

The talk seemed to go pretty well, and I had some great conversations about it afterwards. Hopefully it will be inspiring or at least of some value to folks looking to do some useful things with Python.

Here are the slides:

And you can watch the video too.

PyCon 2016 Dates

I blanked on the dates for PyCon 2016 the other day, and Google was strangely silent on the subject, so here, for your reference (and my SEO benefit), are the dates for PyCon 2016:

  • Tutorials: May 28–29, 2016
  • Conference: May 30–June 1, 2016
  • Sprints: Starting June 2, 2016

This means the tutorials will be over a weekend, and the conference will be during the week instead of the other way around, and it'll be a holiday weekend. I'm looking forward to finding out what this does to the dynamic of the conference.

Hopefully I'll see you there--if I can remember the dates, that is.

Announcing Procatindex

If you're even a little bit like me, you think Procatinator is one of the Internet's greatest achievements. (If you don't know Procatinator, pop on over there for a minute or two and you'll know whether the rest of this post is for you or not.)

If, like me, you have favorite cat GIF/music mashups but can't recall their exact URLs when you're trying to wow your friends, then my latest silly website project is for you.

Behold: the Procatindex!!

Procatindex.com keeps a list of all the Procatinator cats, with titles pulled from the music videos used. The list is automatically refreshed when there's a new cat, and you can subscribe to the RSS feed to make sure you never miss the latest additions.

The site and the script that refreshes it were built in a couple of hours with Requests and Flask, which made short work of the task. (If you aren't familiar with these, you should check them out. Though I have mixed feelings about Flask, it's a wonderful go-to for quick web apps like this. And Requests has become something I can no longer live without.)
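The whole shape of a site like this fits on one screen; here's a hypothetical sketch (the real script's parsing and scheduled refresh are elided):

    import requests
    from flask import Flask

    app = Flask(__name__)
    CATS = []  # (url, title) pairs, refreshed by a separate script

    def refresh():
        """Fetch the source page; the real script parses out each cat's
        URL and the title of the music video used."""
        resp = requests.get("http://procatinator.com/")
        resp.raise_for_status()
        # ... parsing elided ...

    @app.route("/")
    def index():
        items = "".join(
            f'<li><a href="{url}">{title}</a></li>' for url, title in CATS
        )
        return f"<h1>Procatindex</h1><ul>{items}</ul>"

    if __name__ == "__main__":
        app.run()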

Hopefully this improves the effectiveness of your procatination. Enjoy!

Shiny, Let's Be Bad Guys

A couple of weeks ago at the amazing-beyond-belief PyCon 2013, David Stanek and I presented a half-day tutorial. We used a deliberately-vulnerable web application to walk our students through the OWASP Top 10, giving them hands-on experience exploiting these problems and offering advice on how to mitigate them.

While we had concerns about the amount of material and the time available, not to mention the size of the class--we had about 80 people show up!--it seemed to go well, and we got a lot of positive feedback both during the tutorial itself and throughout the rest of the conference. One attendee even told us that thanks to our class, he'd fixed a security problem over lunch immediately after the tutorial! It was immensely satisfying to hear that we'd been able to catalyze some actual improvement in the world.

If the official feedback is good enough, we may look to run this again in the future, whether at smaller venues like PyOhio or next spring at PyCon 2014.

You can clone down the tutorial app if you'd like to follow along with the slides.

Web Development with Python and Django

I had the honor of working with Mike Crute and David Stanek to produce and deliver an all-day tutorial session at CodeMash 2013, where we got folks up to speed on Python and then ran them through a series of iterative exercises as we built a small Django site together.

We promised slides, and though we took a bit of a break to celebrate and then enjoy the conference, I wanted to make sure we didn't wait too long before making them available. Hopefully they will be a useful reference in spite of their lack of the interactivity inherent in a live tutorial session.

You can clone down the sample code repository if you'd like to play along at home.

I think it's safe to say we had a great time presenting at and attending CodeMash and are looking forward to continuing to make sure Python is represented there.

210/365: PyOhio

Five years after creating a logo for it, I finally managed to attend PyOhio, an awesome and free community conference in Columbus, Ohio. It was a treat to catch up with friends from PyCon as well as meet lots of new people, and I'm really excited to see how PyOhio has grown over the years--it's now almost as big as PyCon was when I first started attending it. From great talks to a fun hallway track, sprinting, and socializing, it's clear that there's something great going on here, and I'm really excited to return.

A 365 Project for Code?

While Twittering in jest over the frequency of Requests' releases, Jeff Forcier inspired me to wonder if you could do the software equivalent of the photography 365 projects that seem to be all the rage this year. I started probing around at the edges of the idea, and it seemed like it could actually be an interesting challenge.

The closest thing I could find (which Jeff pointed me to) is Calendar About Nothing, which tracks your public GitHub commits and gives you a lovely red X for each day you push up something that's available to the public. That's really cool, but it's also pretty easy to just be a weasel and rack up an epic streak by twiddling one's dotfiles back and forth every day. You could script it.

So what would a "code 365" look like? Extrapolating from photography, I would say that it's basically:

  1. Write something unique every day -- as humble or as wild and involved as you like, but something new every day. You wouldn't just publish small tweaks to a photo to get extra days of a photography 365, so once your daily code is released, it's done. And while you might be inspired to emulate great photographers or programmers who came before you--and that's okay--it's not okay to just grab somebody else's FOSS and claim it as your own. The challenge is for you to produce something new. (I'm willing to make an exception for "Hello World" entries, especially if they're on day 1, because it's the sort of thing I would do.)

  2. Any language you want -- just as a photographic 365 invites you to try different tools, techniques, and subjects, if you're brave enough to do a code 365, why not experiment with different languages, programming paradigms, and platforms? We often talk a good, smug game about the value of polyglot programmers, but why not prove it to yourself?

  3. Release it -- push it up to GitHub or whatever your preferred platform is; get it out there where we can see it and admire your foolhardy audacity.

I guess the next question is... Who would be crazy enough to do one of these? People like Corey Haines or Gary Bernhardt come to mind. I bet Kenneth Reitz could do it, but then we'd miss out on updates to the ten million other things of his that are increasingly hard to do without.

I'd be tempted to do it--what's good for the goose is good for the gander, right?--but I'm definitely not willing to consider it in 2012. I think one 365 project is more than enough!

What do you think? Is this silly? Or could this Be A Thing?
