Commit 727245f
fix: UTF-8 char boundary panics in smart chunker + schema migration order
Two bugs found during real vault testing:
1. Smart chunker panicked on multi-byte UTF-8 chars (em dash, etc.)
when byte offsets from break-point scoring landed inside multi-byte
sequences. Fixed by snapping all byte offsets to valid char
boundaries before slicing.
2. Schema migration failed on existing v0.1 databases: the SCHEMA
constant tried to CREATE INDEX on docid column before migration
added it. Moved index creation into the migration path so it
runs after the column exists.
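
The first fix can be illustrated with a minimal sketch. It assumes the chunker slices a `&str` by byte offsets; the names `snap_to_char_boundary` and `safe_slice` are hypothetical and are not taken from the commit. It shows one way to snap an offset back to the nearest valid boundary before slicing:

```rust
/// Move `idx` back to the nearest valid UTF-8 char boundary at or before it.
/// Hypothetical helper: the commit does not show its own implementation.
fn snap_to_char_boundary(s: &str, idx: usize) -> usize {
    if idx >= s.len() {
        return s.len();
    }
    let mut i = idx;
    // Offset 0 is always a char boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Slice `s` between two byte offsets after snapping both to char boundaries.
/// Slicing with `&s[start..end]` directly panics if either offset falls
/// inside a multi-byte sequence.
fn safe_slice(s: &str, start: usize, end: usize) -> &str {
    let start = snap_to_char_boundary(s, start);
    // Keep the range well-formed even if `end` snaps back past `start`.
    let end = snap_to_char_boundary(s, end).max(start);
    &s[start..end]
}
```

For example, in `"alpha \u{2014} beta"` the em dash occupies bytes 6 to 8, so `safe_slice(text, 0, 7)` returns `"alpha "` where `&text[0..7]` would panic. Snapping backward rather than forward is a design choice in this sketch; the commit does not say which direction it uses.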
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent b738b6f commit 727245f
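
The migration-order fix described in item 2 can be sketched as follows. This assumes an SQL database with SQLite-style syntax; the table name `documents` and index name `idx_documents_docid` are hypothetical, since the commit names only the `docid` column. The sketch returns the statements in the order they must run, so the index is never created before the column exists:

```rust
// Hypothetical table and index names; the commit only names the `docid` column.
const ADD_DOCID_COLUMN: &str = "ALTER TABLE documents ADD COLUMN docid TEXT";
const CREATE_DOCID_INDEX: &str =
    "CREATE INDEX IF NOT EXISTS idx_documents_docid ON documents(docid)";

/// Return the statements to run, in order, for a database whose table
/// currently has the columns listed in `existing_columns`.
fn migration_statements(existing_columns: &[&str]) -> Vec<&'static str> {
    let mut stmts = Vec::new();
    // Older databases lack the column, so add it first.
    if !existing_columns.contains(&"docid") {
        stmts.push(ADD_DOCID_COLUMN);
    }
    // The index is always created last, after the column is guaranteed to exist.
    stmts.push(CREATE_DOCID_INDEX);
    stmts
}
```

Keeping the index statement out of the base schema and inside the migration path means old and new databases reach the same final state without the index creation ever running against a missing column.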
2 files changed
Lines changed: 23 additions & 4 deletions
File 1 of 2

| Original lines | New lines | Change |
|---|---|---|
| after 165 | 166-175 | 10 lines added |
| after 207 | 218-221 | 4 lines added |
| 255 | 269-272 | 1 line replaced by 4 lines |
| after 269 | 287 | 1 line added |

File 2 of 2

| Original lines | New lines | Change |
|---|---|---|
| 66-67 | | 2 lines removed |
| 133 | 131 | 1 line replaced by 1 line |
| after 134 | 133-135 | 3 lines added |