Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Why Python, Ruby, and Javascript are Slow

Alex Gaynor
February 28, 2013

Why Python, Ruby, and Javascript are Slow

Talk as delivered at Waza, 2013 in San Francisco.

Alex Gaynor

February 28, 2013
Tweet

More Decks by Alex Gaynor

Other Decks in Programming

Transcript

  1. Why Python, Ruby, and Javascript are slow Alex Gaynor Waza

    2013 Thursday, February 28, 13 Hi everyone, thanks for coming here on such short notice, I know there’s a lot of way more talented speakers you could be seeing right now.
  2. You may know me from... Thursday, February 28, 13 So,

    first who am I. Here are some places you may have seen me.
  3. rdio.com Thursday, February 28, 13 I work for a company

    called rdio, we do streaming internet music. We’re awesome, we’re hiring, even if you don’t want a job you should totally use our product.
  4. • CPython • Django • PyPy Thursday, February 28, 13

    I also write a lot of open source. I write primarily in Python, web stuff, compilers, I like infrastructure type projects.
  5. Twitter rants about how computers are bad Thursday, February 28,

    13 If you have the misfortune to follow me on twitter, I tweet about how computers are bad, software is bad, and we should all feel bad.
  6. Topaz topazruby.com Thursday, February 28, 13 And recently I open

    source topaz. A high performance implementation of Ruby on top of RPython, which is the toolchain that powers PyPy. It’s pretty sweet, check it out and contribute!
  7. There is no benchmark but your benchmark Thursday, February 28,

    13 If you’ve read PyPy is X times faster than CPython. Or C is Y times faster than Java or any other number like that. IGNORE THEM. These are myths, lies, and inappropriate attempts to use statistics.
  8. Lame Excuses about why they’re Slow Thursday, February 28, 13

    So we’re going to talk about why a bunch of dynamic languages (and yes this basically applies to all of them, and also a bunch of statically typed languages) are slow. First let’s look at the common wisdom about why they’re slow and see where that gets us (hint: this slide is telling you).
  9. Dynamic Typing Thursday, February 28, 13 So first bit of

    common wisdom: dynamic typing, it makes stuff slow because the compiler can’t optimize stuff based on types. This is complete and utter bollocks. Dynamic typing is not a problem. We have good ways to solve it, like tracing JITs, method JITs with runtime type feedback, and predictive type infrencers.
  10. “You can monkey patch anything” Thursday, February 28, 13 Again,

    not a problem. When you’re building a JIT, it’s basically a solved problem, it’s called “deoptimization” and basically that means, generate really optimized code, but still check your assumptions, if they’re wrong you bail out. Most advanced JITs, like RPython (as seen in PyPy and Topaz), and v8 have this.
  11. Harder to Optimize vs. Slow Thursday, February 28, 13 It’s

    harder to write a good VM for these languages because of those features. It’s hard to make reading an attribute in Python as fast as reading a field out of a struct in C. But we’ve done that so now we’re going to talk about why stuff is REALLY slow.
  12. The Truth Thursday, February 28, 13 In truth, there are

    two things that can cause code to be slow: data structures and algorithms. Those are of course so broad, so general that they could apply to anything. But let’s start with a core point: I can run a line of Python or Ruby as fast as you can run a line of C or Java. It’s all about what that line does.
  13. Let’s talk about C Thursday, February 28, 13 So C

    is fast, it’s basically a DSL for assembly, right? So C is fast because it’s basically like x86 instructions. But comparing what we do in it with what we do in higher level languages can make stuff clear.
  14. struct Point { double x; double y; double z; };

    Thursday, February 28, 13 This is defining a Point struct in C. It’s got 3 fields, they’re all doubles.
  15. class Point(object): def __init__(self, x, y, z): self.x = x

    self.y = y self.z = z Thursday, February 28, 13 This is a reasonable Python version of that. Some obvious differences: * no type declarations * you can add any other field you want to point instances * point instances are always allocated by the GC on the heap NONE OF THAT MATTERS, I WRITE COMPILERS I’LL DEAL WITH THAT
  16. data = { "x" x, "y": y, "z": z, }

    Thursday, February 28, 13 Here’s what I see people do a lot. This is a dictionary. Dictionaries are not objects, dictionaries are hash tables.
  17. Dictionary vs. Object Thursday, February 28, 13 A dictionary is

    a mapping of arbitrary keys (usually of homogenous type) to arbirary values. An object is thing with a fixed set of properties and behaviors. You should never confuse them even if the performance difference didn’t exist, these are different things.
  18. std::hash_set<std::string, double> point; point["x"] = x; point["y"] = y; point["z"]

    = z; Thursday, February 28, 13 I wanted to use a pure C hash table implementation, but googling “C hash table” didn’t bring up useful stuff. Anyways, if you put this in a code review, first your coworkers would laugh a lot, and then they’d make mean jokes about you for the next year. It looks ridiculous, and it is ridiculous.
  19. And it would be slow Thursday, February 28, 13 If

    you mixed up hash tables and objects in C or the like, it would be idiotically slow.
  20. Why don’t people care? Thursday, February 28, 13 On naive

    implementations of these languages, that is CPython, MRI, and most JS implementations up until 5 years ago. There’s no performance difference. When you have a real JIT though, the difference is huge. They train themselves to think of a dictionary as a lightweight object as a weird dogma. And in Javascript it’s all one type so that’s a mess anyways.
  21. Let’s talk about strings Thursday, February 28, 13 Strings are

    probably the most used datastructure in all of programming. You use them EVERYWHERE. It’s pretty rare to have a single function that doesn’t have a string in it.
  22. Given a string matching: “\w+-\d+” return the integral part of

    the value Thursday, February 28, 13 This type of problem is a pretty common practical parsing task.
  23. int(s.split("-", 1)[1]) Thursday, February 28, 13 This is probably how

    like 99% of python people would solve it. Split the string on the dash, get the second part, convert to an int. Let’s talk about efficiency. split is going to allocate a list of 2 elements, allocate 2 strings, do 2 copies, and then convert one to an int, throwing the rest away.
  24. atoi(strchr(s, '-') + 1) Thursday, February 28, 13 What does

    this do? Finds the first instance of a -, and converts the remainder of a string to an int. 0 allocations, 0 copies. Doing this with 0 copies is pretty much impossible in Python, and probably in ruby and Javascript too.
  25. Things that take time • Hash table lookups • Allocations

    • Copying Thursday, February 28, 13 The JIT will take care of the accidental ones, but the ones that are fundamental part of your algorithm, you need to deal with those.
  26. The C way: BYOB Thursday, February 28, 13 Bring Your

    Own Buffer. In, C no stdlib function allocates a string basically, you’re always responsible for bringing a buffer to allocate data into. In practice this means C programmers tends to work on a single allocated block of memory.
  27. char *data = malloc(1024); while (true) { read(fd, data, 1024);

    char *start = data; while (start < data + 1024) { if (isspace(*start)) { break; } start++; } printf("%s\n", start); } Thursday, February 28, 13 A C funciton that reads data from a file descriptor, and then prints each chunk with the leading spaces removed. It does one allocation the whole time.
  28. while True: data = os.read(fd, 1024) print data.lstrip() Thursday, February

    28, 13 The Python version is much prettier, much shorter, and much slower. It does multiple allocations per iteration of the loop. We can easily imagine if we wanted to add a .lower() to print the in lowercase, now there’s an extra allocation and an extra copy in the loop, whereas the C one could easily add that with no additional copying.
  29. long *squares(long n) { long *sq = malloc(sizeof(long) * n);

    for (long i = 0; i < n; i++) { sq[i] = i * i; } return sq; } Thursday, February 28, 13 This is a basically idiomatic C function to return an array of squares in C. One allocation, no copying.
  30. def squares(n): sq = [] for i in xrange(n): sq.append(i

    * i) return sq Thursday, February 28, 13 A basically idiomatic version of the same in Python. No list pre-allocation, so every iteration through the list we have the potential to need to resize the list and copy all the data. That’s inefficient.
  31. Missing APIs Thursday, February 28, 13 Python, Ruby, Javascript, it’s

    not that there’s any necessary reason for these inefficiency. It’s that we’re missing APIs, and we could have GREAT APIs for this. PyPy has some secret ones:
  32. from __pypy__ import newlist_hint def squares(n): sq = newlist_hint(n) for

    i in xrange(n): sq.append(i * i) return sq Thursday, February 28, 13 newlist_hint returns you a list that looks totally normal, but internally it’s been preallocated. For building large lists this is way faster, with 0 loss of power, 0 loss of flexibility.
  33. Don’t make us add heuristics Thursday, February 28, 13 As

    a compile author I’m begging you: don’t make dynamic languages compilers add crazy heuristics. We need good APIs in Python, Ruby, and Javascript for preallocating data, for not copying stuff left and right. And we can build them, it’s way easier to make a kickass API for something in any of these languages than C.
  34. Heuristics = WAG Thursday, February 28, 13 WAG = Wild

    Ass Guess. And that’s what heuristics devolve to. But if we, as a community, don’t embrace a set of idioms for high performance Python, then as a compiler author this is what I’ll end up writing, and it’ll make me sad.
  35. Growing divide between optimizing and not Thursday, February 28, 13

    Code that is optimal for PyPy is not for CPython, and this divide is growing wider as we add more optimizations, and the same is true of Ruby and Javascript VMs. As a dynamic language community we need to get our act together on this front.
  36. Recap • Line for line these languages are fast! •

    Take care in data structures (data structure heuristics are the WORST) • We need better no-copy/preallocate APIs Thursday, February 28, 13
  37. Thank you! https://speakerdeck.com/alex @alex_gaynor Thursday, February 28, 13 I’m told

    there’s no official Q/A so please come find me in the halls and bug me, I’m super excited to talk about this stuff, PyPy, Topaz, web application architecture, pretty much anything.
  38. If there’s time • Java collections vs. Array and Hash.

    Need more choices. • Stop writing C extensions, use something like cffi • Teach good benchmarking practices Thursday, February 28, 13