Per the subject-line ... bitcoin.messages.msg_headers.msg_deser deserializes each header with CBlockHeader.stream_deserialize, which reads exactly 80 bytes. But the P2P 'headers' message wire format appends a 0-byte transaction-count varint after each 80-byte header (Bitcoin Core writes blocks-without-transactions in headers messages, so each entry has the standard CBlock framing including the tx-count varint). The result is that after deserializing header[0], the stream is 1 byte ahead of where the deserializer thinks it is, and header[1] onwards are shifted by 1 byte (or more, if a future protocol upgrade uses a larger varint).
The bug doesn't show up in self-round-trip tests because msg_headers.msg_ser doesn't write the trailing byte either, so serialize-then-deserialize is symmetric. It only surfaces against on-wire data from a real peer.
Steps to reproduce
import socket
from bitcoin.messages import (
MsgSerializable, msg_version, msg_verack,
msg_getheaders, msg_headers, msg_ping, msg_pong,
)
sock = socket.create_connection(('seed.bitcoin.sipa.be', 8333), timeout=10)
f = sock.makefile('rb')
# Standard version/verack handshake.
sock.sendall(msg_version().to_bytes())
for _ in range(10):
m = MsgSerializable.stream_deserialize(f)
if isinstance(m, msg_version):
sock.sendall(msg_verack().to_bytes())
elif isinstance(m, msg_verack):
break
# Ask for headers starting from genesis.
gh = msg_getheaders()
gh.locator.vHave = [b'\x00' * 32]
sock.sendall(gh.to_bytes())
while True:
m = MsgSerializable.stream_deserialize(f)
if isinstance(m, msg_ping):
pong = msg_pong(); pong.nonce = m.nonce
sock.sendall(pong.to_bytes())
continue
if isinstance(m, msg_headers):
hs = m.headers
print("header[0] hash:", hs[0].GetHash()[::-1].hex())
print("header[1] prev:", hs[1].hashPrevBlock[::-1].hex())
print("chain OK? ", hs[1].hashPrevBlock == hs[0].GetHash())
break
Observed output (against any mainnet peer):
header[0] hash: 000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f
header[1] prev: 000000839a8e6886ab5951d76f411475428afc90947ee320161bbf18eb604800
chain OK? False
Note header[1].hashPrevBlock is header[0].GetHash() shifted left by one byte.
Suggested fix
In bitcoin/messages.py:
class msg_headers(MsgSerializable):
@classmethod
def msg_deser(cls, f, protover=PROTO_VERSION):
c = cls()
n = VarIntSerializer.stream_deserialize(f)
for _ in range(n):
header = CBlockHeader.stream_deserialize(f)
VarIntSerializer.stream_deserialize(f) # discard tx-count varint (always 0)
c.headers.append(header)
return c
def msg_ser(self, f):
VarIntSerializer.stream_serialize(len(self.headers), f)
for h in self.headers:
h.stream_serialize(f)
VarIntSerializer.stream_serialize(0, f) # tx-count varint
This makes the deserializer match the on-wire format, and the serializer continues to round-trip with the new deserializer.
Some context...
I hit this implementing Bitcoin P2P getheaders SPV-style header fetching for opentimestamps-client (also yours). Current workaround there is to intercept the 'headers' command at the framing layer and parse the body manually. Happy to open a PR with the fix above plus a regression test that round-trips a wire-formatted body.
Per the subject-line ...
bitcoin.messages.msg_headers.msg_deserdeserializes each header withCBlockHeader.stream_deserialize, which reads exactly 80 bytes. But the P2P 'headers' message wire format appends a 0-byte transaction-count varint after each 80-byte header (Bitcoin Core writes blocks-without-transactions inheadersmessages, so each entry has the standard CBlock framing including the tx-count varint). The result is that after deserializing header[0], the stream is 1 byte ahead of where the deserializer thinks it is, and header[1] onwards are shifted by 1 byte (or more, if a future protocol upgrade uses a larger varint).The bug doesn't show up in self-round-trip tests because
msg_headers.msg_serdoesn't write the trailing byte either, so serialize-then-deserialize is symmetric. It only surfaces against on-wire data from a real peer.Steps to reproduce
Observed output (against any mainnet peer):
Note
header[1].hashPrevBlockisheader[0].GetHash()shifted left by one byte.Suggested fix
In
bitcoin/messages.py:This makes the deserializer match the on-wire format, and the serializer continues to round-trip with the new deserializer.
Some context...
I hit this implementing Bitcoin P2P getheaders SPV-style header fetching for
opentimestamps-client(also yours). Current workaround there is to intercept the 'headers' command at the framing layer and parse the body manually. Happy to open a PR with the fix above plus a regression test that round-trips a wire-formatted body.