Test precedence of columns in conflict with *#685
Conversation
|
rkistner
left a comment
There was a problem hiding this comment.
I think the current sync stream behavior makes sense - we should keep as-is. It is unfortunate that it's different from legacy sync rules, but I think changing it again in sync streams would make things even more confusing.
This is an interesting case, since there isn't very consistent specs we can rely on for this. This case is not covered directly by SQL, since it would just return multiple columns with the same name. But typical db drivers that convert to an object/map would preserve the last column if there are multiple with the same name.
| Source | Behavior |
|---|---|
| node:sqlite | preserve last |
| better-sqlite3 | preserve last |
JS JSON.parse('{"id": 1, "id": 2}') |
preserve last |
SQLite select '{"id": 1, "id": 2}' ->> 'id' |
preserve first |
SQLite select id from (select 1 as id, 2 as id) |
preserve first |
Postgres select id from (select 1 as id, 2 as id) |
ERROR: column reference "id" is ambiguous |
So there isn't good consensus, but I think "preserve last" makes the most sense.
I do think we should update our docs. We have a couple of examples such as SELECT _id as id, * FROM ..., which should rather have the wildcard first. Example.
Test script for `node:sqlite` and `better-sqlite3`
import { DatabaseSync } from 'node:sqlite';
import Database from 'better-sqlite3';
const query1 = `SELECT *, 3 as a FROM (SELECT 1 as a, 2 as b)`;
const query2 = `SELECT 3 as a, * FROM (SELECT 1 as a, 2 as b)`;
console.log('node:sqlite:');
const db = new DatabaseSync(':memory:');
console.log(db.prepare(query1).get());
console.log(db.prepare(query2).get());
console.log('better-sqlite3:')
const db2 = new Database(':memory:');
console.log(db2.prepare(query1).get());
console.log(db2.prepare(query2).get());Output:
node:sqlite:
[Object: null prototype] { a: 3, b: 2 }
[Object: null prototype] { a: 1, b: 2 }
better-sqlite3:
{ a: 3, b: 2 }
{ a: 1, b: 2 }
When combining
*result columns with explicit aliases, it's possible to end up with multiple columns having the same name. Since we sync rows as JS objects, one of those columns would then be excluded from synced data.In the old Sync Rules system, explicit aliases would always take precedence over
*columns. That is,SELECT x AS a, * FROM tblsyncs source row{x: 'x', a: 'a'}as{x: 'x', a: 'x'}.With Sync Streams, the last column wins.
SELECT x AS a, *gives you{x: 'x', a: 'a'},SELECT *, x AS aresults in{x: 'x', a: 'x'}. I don't think this was an intentional change, but it's kind of convenient to have control over predecence in such cases.This adds test verifying the current behavior for Sync Streams. A question is whether we should revert that to the old behavior for backwards compatibility. If not, we can instead change the migration tool to move
*to the front, ensuring explicit aliases take precedence.For more context, see internal ticket 40171.