Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
37 changes: 37 additions & 0 deletions .github/doltlite-patches/eqSeen-preservation.patch
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
Preserve pIdxKey->eqSeen across findMatchingMutMapEntry

findMatchingMutMapEntry's internal sqlite3VdbeRecordCompare probes reset
and then re-set pIdxKey->eqSeen on every iteration, clobbering the flag
that the tree-scan loop set to signal an exact prefix match. OP_SeekGE
with BTREE_SEEK_EQ relies on that eqSeen flag to decide whether an
exact match was found, so without this save/restore the second same-
table DELETE inside a SAVEPOINT silently bails to seek_not_found and
the DELETE is lost.

--- a/src/prolly_btree.c
+++ b/src/prolly_btree.c
@@ -4445,6 +4445,14 @@
|| (pPending && pPending!=pCur->pMutMap
&& !prollyMutMapIsEmpty(pPending)))
&& !(treeFound && treeCmp==0) ){
+ /* findMatchingMutMapEntry's internal sqlite3VdbeRecordCompare
+ ** calls reset and re-set pIdxKey->eqSeen on every probe,
+ ** clobbering the flag that the tree-scan loop above set.
+ ** OP_SeekGE with BTREE_SEEK_EQ relies on that eqSeen to detect
+ ** an exact prefix match; without save/restore, the second
+ ** same-table DELETE inside a SAVEPOINT silently bails out of
+ ** seek_not_found. */
+ int savedEqSeen = pIdxKey->eqSeen;
rc = findMatchingMutMapEntry((ProllyMutMap*)pCur->pMutMap,
pCur->pKeyInfo,
pIdxKey, pSortKey, nSortKey,
@@ -4462,6 +4470,9 @@
return rc;
}
}
+ if( !mutE ){
+ pIdxKey->eqSeen = savedEqSeen;
+ }
if( mutE ){

const u8 *pMutVal = mutE->pVal;
61 changes: 61 additions & 0 deletions .github/doltlite-patches/mergeScan-check-tree-delete.patch
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
Check mutmap for tree-key DELETE when emitting a tree entry in mergeScan

prollyBtCursorIndexMoveto can position the merge cursor directly at a
mutmap INSERT entry via setCursorToMutMapEntryPhys, which jumps mmIdx
past any sort-earlier mutmap entries. The subsequent mergeScan
iteration, on seeing `cmp*dir < 0` (tree entry ahead of mutmap[mmIdx]),
emits the tree entry without noticing that the mutmap contains a
DELETE for that tree key at an earlier order index. The tree key was
logically deleted but stays visible to the scan — SQLite then follows
its rowid and calls TableMoveto to a rowid that mutmap says is
deleted, which hits the "database disk image is malformed" check in
sqlite3VdbeFinishMoveto.

Fix: when about to emit a tree entry, look up the current tree key in
the mutmap; if a DELETE record exists, skip the tree entry and advance
the tree cursor. No behavior change in the common case (no matching
mutmap entry → emit tree as before), constant-factor overhead
otherwise.

Reproduces deterministically on:

CREATE TABLE c (ka TEXT, kb TEXT, kc TEXT, v TEXT,
PRIMARY KEY (ka, kb, kc));
INSERT INTO c VALUES ('db','t','a','');
BEGIN IMMEDIATE;
INSERT INTO c VALUES ('db','t','b','');
DELETE FROM c WHERE ka='db' AND kb='t' AND kc='a';
UPDATE c SET v = 'x' WHERE ka='db' AND kb='t'; -- malformed w/o fix

--- a/src/prolly_btree.c
+++ b/src/prolly_btree.c
@@ -3631,6 +3631,29 @@
}
cmp = mergeCompare(pCur, e);
if( cmp*dir < 0 ){
+ /* Tree entry is ahead of mutmap[mmIdx] in scan direction.
+ ** Check whether the mutmap has a DELETE entry for the tree
+ ** key at an order index the iteration has already walked
+ ** past (e.g. after setCursorToMutMapEntryPhys jumped mmIdx
+ ** directly to a later INSERT). Without this check the scan
+ ** would emit a logically-deleted tree row and SQLite would
+ ** later TableMoveto a rowid that mutmap says is gone. */
+ ProllyMutMapEntry *delE = 0;
+ int delRc;
+ if( pCur->curIntKey ){
+ delRc = prollyMutMapFindRc(pCur->pMutMap, 0, 0,
+ prollyCursorIntKey(&pCur->pCur), &delE);
+ }else{
+ const u8 *pK; int nK;
+ prollyCursorKey(&pCur->pCur, &pK, &nK);
+ delRc = prollyMutMapFindRc(pCur->pMutMap, pK, nK, 0, &delE);
+ }
+ if( delRc!=SQLITE_OK ) return delRc;
+ if( delE && delE->op==PROLLY_EDIT_DELETE ){
+ int advRc = advanceTreeCursor(pCur, dir);
+ if( advRc!=SQLITE_OK ) return advRc;
+ continue;
+ }
pCur->mergeSrc = MERGE_SRC_TREE;
if( pRes ) *pRes = 0;
return SQLITE_OK;
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
Preserve the original record when the sort-key encoding is lossy

For a non-INTKEY cursor, prollyBtCursorInsert normally picks between
two storage shapes:

- splitKey=1 — sort key holds a prefix of the record and the rest of
the record is stored as the value (WITHOUT ROWID tables, secondary
indexes);
- splitKey=0 — sort key holds every field and no value is stored.

On read, getCursorPayload prefers the stored value; if none is
present it reconstructs a SQLite record from the sort key bytes. That
round-trip is lossless for BINARY collations but the sort-key encoder
in sortkey.c lower-cases A-Z under NOCASE and strips trailing spaces
under RTRIM.

When SQLite picks the auto-index as a covering index for a plain
column read — e.g. `SELECT pkcol FROM t` on a composite-PK rowid
table with a NOCASE pkcol — OP_Column reads the reconstructed (folded)
bytes from getCursorPayload and returns them in place of the original
column value. Reproduces on:

CREATE TABLE t (ID INT, pkcol TEXT COLLATE NOCASE, extra TEXT,
PRIMARY KEY (ID, pkcol));
INSERT INTO t VALUES (1, 'Johnny', 'x');
SELECT pkcol FROM t; -- 'johnny' (wrong) without fix
SELECT * FROM t; -- 'Johnny' (right) both ways

Fix: when any field in pKeyInfo uses a collation whose sort-key
encoding is not a bijection over the original bytes (NOCASE, RTRIM),
preserve the original record as the value alongside the sort key.
getCursorPayload then prefers the original bytes over reconstruction.
No behavior change when all collations are BINARY: the value side
stays empty and storage footprint is unchanged.

--- a/src/prolly_btree.c
+++ b/src/prolly_btree.c
@@ -4716,7 +4716,33 @@
pCur->pKeyInfo,
&pSortKey, &nSortKey);
if( rc==SQLITE_OK ){
- if( splitKey ){
+ /* When every field of the record is folded into the sort key
+ ** (splitKey==0) we would normally store no value and, on read,
+ ** reconstruct the SQLite record from the sort key. That is
+ ** lossless for BINARY collations but the sort-key encoder
+ ** lower-cases A-Z under NOCASE and strips trailing spaces
+ ** under RTRIM. When SQLite picks the auto-index as a covering
+ ** index for a plain column read (e.g. `SELECT pkcol FROM t`
+ ** on a composite-PK rowid table with NOCASE pkcol), OP_Column
+ ** reads the reconstructed (folded) bytes and returns them to
+ ** the caller in place of the original column value. Preserve
+ ** the original record as the value whenever any field in
+ ** pKeyInfo uses a lossy collation so the read path prefers
+ ** the original bytes over the reconstructed ones. */
+ int lossy = 0;
+ if( pCur->pKeyInfo ){
+ int i;
+ for( i = 0; i < pCur->pKeyInfo->nAllField; i++ ){
+ CollSeq *pColl = pCur->pKeyInfo->aColl[i];
+ if( pColl && pColl->zName
+ && ( sqlite3StrICmp(pColl->zName, "NOCASE")==0
+ || sqlite3StrICmp(pColl->zName, "RTRIM")==0 ) ){
+ lossy = 1;
+ break;
+ }
+ }
+ }
+ if( splitKey || lossy ){
rc = prollyMutMapInsert(pCur->pMutMap,
pSortKey, nSortKey, 0,
(const u8*)pPayload->pKey, (int)pPayload->nKey);
Loading
Loading