Skip to content

Investigate string decode performance improvements #4

@andrewthad

Description

@andrewthad

In 665c428, I've added a benchmark that focuses on the performance of decoding strings. Here's the performance of decoding 4KB of json, most of which is just 30-character strings:

benchmarked json/url/100/decode
time                 8.053 μs   (7.736 μs .. 8.411 μs)
                     0.996 R²   (0.989 R² .. 0.999 R²)
mean                 8.089 μs   (8.051 μs .. 8.139 μs)
std dev              74.41 ns   (48.48 ns .. 107.8 ns)

benchmarked aeson/url/100/decode
time                 22.66 μs   (21.50 μs .. 24.51 μs)
                     0.994 R²   (0.989 R² .. 0.999 R²)
mean                 22.72 μs   (22.49 μs .. 22.84 μs)
std dev              276.1 ns   (159.0 ns .. 414.7 ns)

Not bad. We're ahead of aeson by a factor of three, but can this number be improved further? String decode currently walks the string, byte-by-byte, until it finds the end of it. As it walks the string, it keeps up with information about whether or not anything will need to be unescaped. I think that it should be possible to instead walk the string w64-by-w64. This could be done by adapting the approach in bytestring-encodings to work with ByteArray instead of Ptr and adding some additional bit twiddling hacks. The general idea would be:

  • Fail if you encounter a backslash
  • Fail if you encounter a byte less than 0x20
  • Fail if you encounter a byte greater than 0x7E
  • Fail if your read would give you a w64 that straddled the end of the string (simplifies things a little)
  • Succeed if you encounter a "

Failure just means to fall back to the existing string decode logic, and succeed means that we may perform a memcpy (as we do now). This whole thing is a little bit tricky because it's possible to encounter both a " and a failing byte in them same w64, and then the order that they showed up in matters. But I think that the conservative action of always failing even if the quote showed up first is probably the best course. It simplifies it, and the bytes after a string ends are probably ascii characters anyway, so this shouldn't cost a ton of performance.

This needs to be implemented and benchmarked, but I think it could this benchmark at least 2x faster.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions