The Stack Overflow Survey Results for 2019 are in! There is some official analysis that mentions some things that mattered to me and some that did not. I decided to dig into the data and see if I could find some things that would interest my readership.

import csv, collections, itertools
with open("survey_results_public.csv") as fpin:
    reader = csv.DictReader(fpin)
    responses = list(reader)
len(responses)
88883

Wow, almost 90K respondents! This is the sweet spot: enough to make meaningful generalizations, yet small enough to analyze with rudimentary tools rather than big-data-ware.

pythonistas = [x for x in responses if 'Python' in x['LanguageWorkedWith']]
len(pythonistas)/len(responses)
0.41001091322300104

About 40% of the respondents use Python in some capacity. That is pretty cool! This is one of the things where I wonder if there is bias in the source data. Are people who use Stack Overflow, or respond to surveys for SO, more likely to be the kind of person who uses Python? Or less?
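One caveat about the filter above: 'Python' in x['LanguageWorkedWith'] is a substring test on the raw answer string. Assuming the column holds a semicolon-separated list (the way DevType is split later in this post), a stricter check splits first. This is a minimal sketch; the helper name uses_language and the sample rows are made up for illustration and do not come from the survey file.

```python
def uses_language(row, language, column="LanguageWorkedWith"):
    """Return True if `language` is one of the semicolon-separated answers."""
    # Split on ";" so a name cannot match inside a longer name.
    answers = [part.strip() for part in row.get(column, "").split(";")]
    return language in answers


# Made-up rows for illustration only.
sample_rows = [
    {"LanguageWorkedWith": "Python;Go"},
    {"LanguageWorkedWith": "JavaScript;HTML/CSS"},
    {"LanguageWorkedWith": "NA"},
]

python_rows = [row for row in sample_rows if uses_language(row, "Python")]
# A substring test would wrongly count the JavaScript row as Java.
java_rows = [row for row in sample_rows if uses_language(row, "Java")]
```

For Python itself the two tests should agree as long as no other language name in the survey contains the string "Python", so the 41% figure likely stands either way.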

In any case, I am excited! This means my favorite language, for all its issues, is doing well. This is also a good reminder that we need to think about the consequences of our decisions on a big swath of developers we will never ever meet.

opensource = collections.Counter(x['OpenSourcer'] for x in pythonistas)
sorted(opensource.items(), key=lambda x:x[1], reverse=True)
[('Never', 11310),
 ('Less than once per year', 10374),
 ('Less than once a month but more than once per year', 9572),
 ('Once a month or more often', 5187)]
opensource['Once a month or more often']/len(pythonistas)
0.1423318607139917

Python is open source. Almost all important libraries (Django, Pandas, PyTorch, requests) are open source. Many important tools (Jupyter) are open source. Yet the share of Python-using respondents who contribute to open source with any kind of regular cadence (once a month or more often) is less than 15%.

general_opensource = collections.Counter(x['OpenSourcer'] for x in responses)
sorted(general_opensource.items(), key=lambda x:x[1], reverse=True)
[('Never', 32295),
 ('Less than once per year', 24972),
 ('Less than once a month but more than once per year', 20561),
 ('Once a month or more often', 11055)]

The Python community does compare well to respondents in general, though: only about 12% of all respondents contribute once a month or more often.
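To make the comparison concrete, the two sets of counts printed above can be turned into shares. This is a small sketch that copies the counts from this post as literals; the helper name shares is my own.

```python
from collections import Counter

# Counts copied from the two outputs shown in this post.
python_counts = Counter({
    "Never": 11310,
    "Less than once per year": 10374,
    "Less than once a month but more than once per year": 9572,
    "Once a month or more often": 5187,
})
all_counts = Counter({
    "Never": 32295,
    "Less than once per year": 24972,
    "Less than once a month but more than once per year": 20561,
    "Once a month or more often": 11055,
})


def shares(counts):
    """Convert raw counts into fractions of the group total."""
    total = sum(counts.values())
    return {answer: n / total for answer, n in counts.items()}


python_shares = shares(python_counts)
all_shares = shares(all_counts)
for answer in all_counts:
    print(f"{answer}: Python {python_shares[answer]:.1%}, "
          f"everyone {all_shares[answer]:.1%}")
```

With these counts, monthly-or-more contributors come out at roughly 14% of Python users against roughly 12% of all respondents, and "Never" at roughly 31% against roughly 36%.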

devtype = collections.Counter(itertools.chain.from_iterable(x["DevType"].split(";") for x in pythonistas))
devtype['DevOps specialist']/len(responses)
0.052282213696657406

About 5% of total respondents are my peers: using Python for DevOps. That is pretty exciting! My interest in that is not merely theoretical: my upcoming book targets that crowd.

general_devtype = collections.Counter(itertools.chain.from_iterable(x["DevType"].split(";") for x in responses))
general_devtype['DevOps specialist']/len(responses), devtype['DevOps specialist']/len(pythonistas)
(0.09970410539698255, 0.12751420025793705)

In general, DevOps specialists are about 10% of respondents, and about 13% of Python users.

devtype['DevOps specialist']/general_devtype['DevOps specialist']
0.524373730534868

Over 50% of DevOps specialists use Python!
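The DevType one-liners above are dense, so here is the same counting idea on a toy input. It is a sketch only: count_multi is a hypothetical helper name, and the rows are invented rather than taken from the survey.

```python
import collections
import itertools


def count_multi(rows, column, sep=";"):
    """Count each value of a multi-answer column once per row that lists it."""
    return collections.Counter(
        itertools.chain.from_iterable(row[column].split(sep) for row in rows)
    )


# Invented rows for illustration only.
toy_rows = [
    {"DevType": "DevOps specialist;Developer, back-end"},
    {"DevType": "Developer, back-end"},
    {"DevType": "NA"},
]
toy_counts = count_multi(toy_rows, "DevType")
```

Because one respondent can list several roles, the per-role shares can add up to more than 100%. A Counter also returns 0 for a role nobody listed, so a misspelled key fails silently rather than raising an error.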

def safe_int(x):
    try:
        return int(x)
    except ValueError:
        return -1

intermediate = sum(1 for x in pythonistas if 1<=safe_int(x['YearsCode'])<=5)

My next hush-hush (for now!) project is going to be targeting intermediate Python developers. I wish I could slice by "number of years writing in Python", but this is the best I could do. (I treat "NA" responses as "not intermediate". This is OK, since I prefer to underestimate rather than overestimate.)
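To see what the 1-to-5 filter does with non-numeric answers, here is a sketch that reuses safe_int from above. The textual answers other than "NA" are my assumption about what the YearsCode column may contain, and is_intermediate is a name I made up.

```python
def safe_int(x):
    # Same helper as in the post: anything that is not an integer becomes -1.
    try:
        return int(x)
    except ValueError:
        return -1


def is_intermediate(years_code):
    """True when the answer parses to between 1 and 5 years, inclusive."""
    return 1 <= safe_int(years_code) <= 5


# "NA" appears in the post; the other textual answers are assumed.
examples = ["3", "5", "6", "NA", "Less than 1 year", "More than 50 years"]
flags = {value: is_intermediate(value) for value in examples}
```

Every answer that is not a plain integer maps to -1 and so lands outside the range, which is what makes the estimate err on the low side.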

intermediate/len(responses)
0.11346376697456206

11%! Not bad.

general_intermediate = sum(1 for x in responses if 1<=safe_int(x['YearsCode'])<=5)
intermediate/len(pythonistas), general_intermediate/len(responses)
(0.27673352907279863, 0.2671264471271222)

Seems like using Python does not much change the chances of someone being intermediate.
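The comparison above is between Python users and everyone, and everyone includes the Python users. As a rough cross-check, the rate among non-Python respondents can be recovered from the fractions already printed in this post, since the overall rate is a weighted average of the two groups. This is a sketch, and the variable names are my own.

```python
# Fractions copied from the outputs shown in this post.
python_share = 0.41001091322300104            # len(pythonistas)/len(responses)
intermediate_among_python = 0.27673352907279863
intermediate_among_all = 0.2671264471271222

# overall = share * python_rate + (1 - share) * other_rate, solved for other_rate
intermediate_among_others = (
    intermediate_among_all - python_share * intermediate_among_python
) / (1 - python_share)

# Difference between the two groups, in percentage points.
gap_points = 100 * (intermediate_among_python - intermediate_among_others)
print(f"others: {intermediate_among_others:.1%}, gap: {gap_points:.1f} points")
```

This gives about 26.0% for non-Python respondents, roughly 1.6 percentage points below the Python figure, which is consistent with the conclusion that the difference is small.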

Summary

  • 40% of respondents use Python. Python is kind of a big deal.
  • 5% of respondents use Python for DevOps. This is a lot! DevOps as a profession is less than 10 years old.
  • 11% of respondents are intermediate Python users. My previous book targets this crowd.

(Thanks to Robert Collins and Matthew Broberg for their comments on an earlier draft. Any remaining issues are purely my responsibility.)


Inbox Zero

Wed 15 May 2019 by Moshe Zadka

I am the parent of two young kids. It is easy to sink into random stuff, and not follow up on goals. Strict time management and prioritization mean I get to work on open source projects, write programming books and update my blog with a decent cadence. Since a lot …


Publishing a Book with Sphinx

Mon 08 April 2019 by Moshe Zadka

A while ago, I decided I wanted to self-publish a book on improving your Python skills. It was supposed to be short, sweet, and fairly inexpensive.

The journey was a success, but had some interesting twists along the way.

From the beginning, I knew what technology I wanted to write …


A Local LRU Cache

Fri 29 March 2019 by Moshe Zadka

"It is a truth universally acknowledged, that a shared state in possession of mutability, must be in want of a bug." -- with apologies to Jane Austen

As Ms. Austen, and Henrik Eichenhardt, taught us, shared mutable state is the root of all evil.

Yet, the official documentation of functools tells …


Don't Make It Callable

Wed 13 February 2019 by Moshe Zadka

There is a lot of code that overloads the __call__ method. This is the method that "calling" an object activates: something(x, y, z) will call something.__call__(x, y, z) if something is a member of a Python-defined class.

At first, like every operator overload, this seems like a …


Staying Safe with Open Source

Thu 24 January 2019 by Moshe Zadka

A couple of months ago, a successful attack against the Node ecosystem resulted in stealing an undisclosed amount of bitcoins from CoPay wallets.

The technical flow of the attack is well-summarized by the NPM blog post. Quick summary:

  1. nodemon, a popular way to run Node applications, depends on event-stream.
  2. The …

Checking in JSON

Tue 08 January 2019 by Moshe Zadka

JSON is a useful format. It might not be ideal for hand-editing, but it does have the benefit that it can be hand-edited, and it is easy enough to manipulate programmatically.

For this reason, it is likely that at some point or another, checking a JSON file into your …


Office Hours

Sat 08 December 2018 by Moshe Zadka

If you want to speak to me, 1-on-1, about anything, I want to be able to help. I am a busy person. I have commitments. But I will make the time to talk to you.

Why?

  • I want to help.
  • I think I'll enjoy it. I like talking to people …

Common Mistakes about Generational Garbage Collection

Wed 28 November 2018 by Moshe Zadka

(Thanks to Nelson Elhage and Saivickna Raveendran for their feedback on earlier drafts. All mistakes that remain are mine.)

When talking about garbage collection, the notion of "generational collection" comes up. The usual motivation given for generational garbage collection is that "most objects die young". Therefore, we put the objects …


The Conference That Was Almost Called "Pythaluma"

Wed 07 November 2018 by Moshe Zadka

As my friend Thursday said in her excellent talk (sadly, not up as of this time) naming things is important. Avoiding in-jokes is, in general, a good idea.

It is with mixed feelings, therefore, that my pun-loving heart reacted to Chris's disclosure that the most common suggestion was to call …
