Segmentation fault on EXPLAIN of a query with LEFT JOIN to a distributed table and correlated subqueries in Citus 14.0.0 (PostgreSQL 18.1) #8548

@duerwuyi

Description

I found a reproducible backend crash in Citus 14.0.0.

On my setup, EXPLAIN of the query below terminates the client backend with signal 11: Segmentation fault, which then forces PostgreSQL recovery. The same testcase does not reproduce on Citus 13.0.2.

The crash happens during EXPLAIN, so the fault appears to be in planning or distributed planning rather than in execution.

Environment

  • PostgreSQL 18.1
  • Citus 14.0.0

select version();
PostgreSQL 18.1 (Debian 18.1-1.pgdg13+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 14.2.0-19) 14.2.0, 64-bit

select citus_version();
Citus 14.0.0 on x86_64-pc-linux-gnu, compiled by gcc (Debian 14.2.0-19) 14.2.0, 64-bit

Version comparison

  • Reproduces on: Citus 14.0.0
  • Does not reproduce on: Citus 13.0.2

Minimal schema

Only t22 is distributed, and it is distributed by colocated_key.
The other tables are regular local tables.

DROP TABLE IF EXISTS t4 CASCADE;
DROP TABLE IF EXISTS t5 CASCADE;
DROP TABLE IF EXISTS t7 CASCADE;
DROP TABLE IF EXISTS t2 CASCADE;
DROP TABLE IF EXISTS t22 CASCADE;

CREATE TABLE t4 (
    vkey integer,
    pkey integer,
    c30 integer,
    c31 integer,
    c32 text
);

CREATE TABLE t5 (
    vkey integer,
    pkey integer,
    c33 text,
    c34 integer,
    c35 integer,
    c36 timestamp without time zone
);

CREATE TABLE t7 (
    vkey integer,
    pkey integer,
    c45 integer,
    c46 integer,
    c47 integer,
    c48 numeric,
    c49 integer
);

CREATE TABLE t2 (
    vkey integer,
    pkey integer,
    c15 numeric,
    c16 timestamp without time zone,
    c17 text,
    c18 text,
    c19 timestamp without time zone,
    c20 timestamp without time zone,
    c21 integer
);

CREATE TABLE t22 (
    vkey integer,
    pkey integer,
    c37 numeric,
    c38 text,
    c39 numeric,
    c40 numeric,
    c41 numeric,
    c42 integer,
    c43 timestamp without time zone,
    c44 numeric,
    colocated_key numeric
);

SELECT create_distributed_table('t22', 'colocated_key');
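
As a sanity check on the setup above, a query along these lines against the `citus_tables` view could confirm that t22 is the only distributed table. This is a sketch and not part of the original reproduction; the choice of columns is an assumption about what is useful to show.

```sql
-- Sanity check (not part of the original report): list Citus-managed tables.
-- citus_tables is a view shipped with Citus; with the schema above it is
-- expected to list t22 only, because t2, t4, t5 and t7 are plain local tables.
SELECT table_name, citus_table_type, distribution_column
FROM citus_tables
ORDER BY table_name::text;
```

If any table other than t22 shows up here, the setup differs from the one described in this report.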

## Query to reproduce

```sql
select
  70 as c_0
from
  (
    select
      (
        exists (
          select
            ref_5.c33 as c_0
          from
            t5 as ref_5
          where
            (make_timestamp(2001, 7, 13, 17, 53, 31)) = (ref_1.c43)
        )
      ) as c_0
    from
      (
        t4 as ref_0
        left outer join t22 as ref_1
          on (ref_0.vkey = ref_1.vkey)
      )
    where
      (
        (ref_0.c31) >= (
          select
            ref_1.pkey as c_0
          from
            t2 as ref_2
          where
            (true) < ((ref_2.c17) ^@ (ref_0.c32))
          order by
            c_0 desc
          limit 1
        )
      ) in (
        select
          (ref_1.c40) <= (ref_1.c37) as c_0
        from
          t7 as ref_4
        where
          not ((ref_1.c40) <> (ref_1.c41))
      )
  ) as subq_0
where
  (TRUE) < (TRUE);
```

Log

LOG:  client backend (PID 114) was terminated by signal 11: Segmentation fault
DETAIL:  Failed process was running: EXPLAIN select ...
LOG:  terminating any other active server processes
LOG:  all server processes terminated; reinitializing
LOG:  database system was interrupted; automatic recovery in progress

Expected behavior

EXPLAIN should either:

  • successfully produce a plan, or
  • return a normal SQL/planning error if this query shape is unsupported.

It should not crash the backend.
