feat(orm): add fuzzy search and relevance ordering (PostgreSQL) #2573
docloulou wants to merge 2 commits into zenstackhq:dev
… only)
- Introduced fuzzy search operators (`fuzzy`, `fuzzyContains`) in the ORM.
- Added `RelevanceOrderBy` type for sorting based on fuzzy search relevance.
- Implemented fuzzy search filters in PostgreSQL dialect.
- Added error handling for unsupported fuzzy search features in MySQL and SQLite dialects.
- Updated Zod schema factory to include fuzzy search fields.
- Created a new `Flavor` model in the schema for testing purposes.
📝 Walkthrough
Adds fuzzy text search operators and relevance-based ordering to the ORM: types, Zod schemas, dialect extension points and implementations (Postgres), unsupported stubs (MySQL/SQLite), client query builder updates, test schema/model additions, and a comprehensive PostgreSQL-only e2e test suite.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks | ✅ 3 passed
Actionable comments posted: 3
🧹 Nitpick comments (1)
packages/orm/src/client/crud/dialects/postgresql.ts (1)
561-590: Well-implemented PostgreSQL fuzzy search using pg_trgm.
The implementation correctly uses:
- Trigram similarity operator (`%`) for `fuzzy`
- Word similarity operator (`<%`) for `fuzzyContains` with proper operand ordering
- `GREATEST()` aggregation for multi-field relevance scoring
The use of `sql` template tags is appropriate here as these are PostgreSQL-specific operators not available in Kysely's type-safe API. The `sql` template is Kysely's escape hatch mechanism.
Note: Extension dependencies (`pg_trgm` and `unaccent`) are already documented in the type definitions (`crud-types.ts`). Consider adding runtime error handling if extensions are missing, similar to the `createNotSupportedError` pattern used for MySQL/SQLite, to provide users with a clearer message instead of a generic PostgreSQL error.
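The runtime check suggested in this nitpick could look roughly like the sketch below. It is not code from the PR: `findMissingExtensions`, `MissingExtensionError`, and the injected `QueryFn` are hypothetical names, and the only facts relied on are that PostgreSQL lists installed extensions in the `pg_extension` catalog (column `extname`) and that the two extensions are `pg_trgm` and `unaccent`.

```typescript
// Hypothetical sketch; none of these names are ZenStack or Kysely APIs.
type QueryFn = (sql: string) => Promise<Array<{ extname: string }>>;

const FUZZY_EXTENSIONS = ['pg_trgm', 'unaccent'] as const;

class MissingExtensionError extends Error {
    readonly missing: readonly string[];

    constructor(missing: readonly string[]) {
        super(
            `Fuzzy search requires PostgreSQL extension(s): ${missing.join(', ')}. ` +
                `Enable each one with CREATE EXTENSION IF NOT EXISTS <name>;`,
        );
        this.name = 'MissingExtensionError';
        this.missing = missing;
    }
}

// Pure helper: which required extensions are absent from the installed list?
function findMissingExtensions(installed: readonly string[]): string[] {
    const present = new Set(installed);
    return FUZZY_EXTENSIONS.filter((ext) => !present.has(ext));
}

// Async wrapper: reads the catalog once and throws a descriptive error.
async function ensureFuzzyExtensions(query: QueryFn): Promise<void> {
    const rows = await query('SELECT extname FROM pg_extension');
    const missing = findMissingExtensions(rows.map((row) => row.extname));
    if (missing.length > 0) {
        throw new MissingExtensionError(missing);
    }
}
```

Keeping the comparison in a pure function makes it testable without a database; where and how often the dialect would call the async wrapper (once at initialization, or lazily on first fuzzy query) is left open here.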
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/orm/src/client/crud-types.ts`:
- Around line 912-930: Update the RelevanceOrderBy type and its JSDoc to match
runtime behavior: change the _relevance.fields type from plain array to a
NonEmptyArray<NonRelationFields<Schema, Model>> so an empty fields list is
rejected at the type level, and revise the comment for _relevance to indicate
that relevance uses PostgreSQL similarity() (and that MySQL is not supported /
throws NotSupported at runtime) so IntelliSense reflects actual provider
constraints; locate the RelevanceOrderBy type and the _relevance field
declaration to make these edits.
In `@packages/orm/src/client/crud/dialects/base-dialect.ts`:
- Around line 1110-1131: The _relevance branch adds complex ordering but cursor
pagination still assumes simple {field: 'asc'|'desc'} entries; update handling
so cursor with a _relevance order is either rejected early or supported: modify
the code path that constructs cursor filters (function buildCursorFilter) to
detect order entries where field === '_relevance' (created via
buildRelevanceOrderBy / buildFieldRef / negateSort) and generate a comparison
that first compares computed relevance (value.search against the same fields)
then applies a deterministic tie-breaker (e.g., primary key) in the same sort
direction, or alternatively throw a clear validation error when a cursor is
supplied alongside an _relevance order; ensure tests cover both rejection and
correct SQL generation if you implement support.
In `@packages/orm/src/client/zod/factory.ts`:
- Around line 1180-1192: The _relevance.fields enum is currently built from all
scalar fields (scalarFieldNames) which allows non-string types; change the
scalarFieldNames computation in the getModelFields/filter pipeline to include
only string-typed scalar fields (e.g., filter by the field metadata indicating
type === 'String' or equivalent in your field definition) so that
_relevance.fields contains only string fields, and keep the z.enum(...) usage
but fed from the new string-only scalarFieldNames; update the code around
getModelFields, scalarFieldNames, and the _relevance strictObject construction
to reflect this restriction.
---
Nitpick comments:
In `@packages/orm/src/client/crud/dialects/postgresql.ts`:
- Around line 561-590: Add runtime checks for the required PostgreSQL extensions
and throw a clear user-facing error if missing: implement an internal check
(e.g., ensurePostgresExtensionsAvailable) that queries pg_extension for
'pg_trgm' and 'unaccent' and call it from the PostgreSQL dialect initialization
or lazily before using fuzzy features; update buildFuzzyFilter,
buildFuzzyContainsFilter, and buildRelevanceOrderBy to call this check (or
ensure it's called beforehand) and throw a createNotSupportedError-style error
with a clear message and remediation steps if either extension is absent.
📒 Files selected for processing (12)
- packages/orm/src/client/constants.ts
- packages/orm/src/client/crud-types.ts
- packages/orm/src/client/crud/dialects/base-dialect.ts
- packages/orm/src/client/crud/dialects/mysql.ts
- packages/orm/src/client/crud/dialects/postgresql.ts
- packages/orm/src/client/crud/dialects/sqlite.ts
- packages/orm/src/client/zod/factory.ts
- tests/e2e/orm/client-api/fuzzy-search.test.ts
- tests/e2e/orm/schemas/basic/input.ts
- tests/e2e/orm/schemas/basic/models.ts
- tests/e2e/orm/schemas/basic/schema.ts
- tests/e2e/orm/schemas/basic/schema.zmodel
- `_relevance.fields` restricted to String fields in the Zod schema
- Cursor pagination combined with `_relevance` ordering is rejected
- `RelevanceOrderBy` type restricted to StringFields with a non-empty tuple
- JSDoc updated to reflect PostgreSQL-only support
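The three code-level fixes in this commit can be illustrated with a small sketch. It is written under assumptions and is not the PR's code: `FieldDef` is a stand-in for the ORM's field metadata, the helper names are hypothetical, and the `sort` key mirrors Prisma's `_relevance` shape rather than anything confirmed in this thread.

```typescript
// Hypothetical stand-in for the ORM's schema metadata.
type FieldDef = { name: string; type: string; relation?: boolean };

// A tuple with at least one element, so `fields: []` fails to type-check.
type NonEmptyArray<T> = [T, ...T[]];

// Assumed shape: `fields` and `search` appear in the review; `sort` is an assumption.
type RelevanceOrderBy<F extends string> = {
    _relevance: { fields: NonEmptyArray<F>; search: string; sort: 'asc' | 'desc' };
};

// Only String-typed, non-relation fields are valid relevance targets.
function stringFieldNames(fields: readonly FieldDef[]): string[] {
    return fields.filter((f) => !f.relation && f.type === 'String').map((f) => f.name);
}

// Reject cursor pagination when any orderBy entry uses `_relevance`.
function assertNoCursorWithRelevance(args: { cursor?: unknown; orderBy?: unknown }): void {
    if (args.cursor === undefined) {
        return;
    }
    const entries: unknown[] = Array.isArray(args.orderBy)
        ? args.orderBy
        : args.orderBy === undefined
          ? []
          : [args.orderBy];
    const usesRelevance = entries.some(
        (entry) => typeof entry === 'object' && entry !== null && '_relevance' in entry,
    );
    if (usesRelevance) {
        throw new Error('Cursor pagination cannot be combined with "_relevance" ordering');
    }
}

// Type-checks: at least one field is supplied.
const exampleOrder: RelevanceOrderBy<'name' | 'description'> = {
    _relevance: { fields: ['name'], search: 'vanilla', sort: 'desc' },
};
```

Rejecting the combination early is the simpler of the two options raised in the review; supporting it would require comparing the computed relevance score plus a deterministic tie-breaker inside the cursor filter.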
Hi @docloulou, thanks for this amazing PR: a very useful feature, and well implemented! I'm wondering if you're fine with delaying it to release v3.7 or v3.8. I ask because, although not directly related, it's a bit odd to support fuzzy search but not regular full-text search (a feature gap from Prisma). I hope to get FTS implemented, probably in v3.7, and we can ship this feature either together with it or in a subsequent minor release. What do you think?
No problem for me. If the code in this PR looks solid to you, it can serve as a good template for adding the FTS feature. The main things left to handle would be adding the
Note: one thing to watch out for: in this PR I’m using `_relevance` (as Prisma does for FTS) for the fuzzy search, so there could be a conflict.
Summary
- `fuzzy` and `fuzzyContains` filter operators for String fields in `where` clauses, using PostgreSQL's `pg_trgm` extension with `unaccent` for accent-insensitive trigram matching
- `_relevance` ordering in `orderBy` to sort results by fuzzy similarity score, supporting single and multiple fields
- `NotSupported` errors for these operators on MySQL and SQLite
New API
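As a rough illustration of the query arguments this API accepts, the sketch below builds three argument objects against local stand-in types. The types are not the generated ZenStack types: the `name` and `description` fields on the `Flavor` model and the `sort` key inside `_relevance` are assumptions made for the example.

```typescript
// Illustrative local types. The real ones are generated by the ORM from the
// schema; the field names and the `sort` key are assumptions.
type StringFilter = { contains?: string; fuzzy?: string; fuzzyContains?: string };
type FlavorField = 'name' | 'description';
type FlavorWhere = Partial<Record<FlavorField, string | StringFilter>>;
type FlavorRelevanceOrderBy = {
    _relevance: { fields: [FlavorField, ...FlavorField[]]; search: string; sort: 'asc' | 'desc' };
};
type FlavorFindManyArgs = { where?: FlavorWhere; orderBy?: FlavorRelevanceOrderBy; take?: number };

// Whole-value fuzzy match: tolerates typos and missing accents.
const byFuzzyName: FlavorFindManyArgs = {
    where: { name: { fuzzy: 'creme brulee' } },
};

// Approximate substring match inside a longer text field.
const byFuzzyContains: FlavorFindManyArgs = {
    where: { description: { fuzzyContains: 'choclate' } },
};

// Filter, then rank by the best similarity across several fields.
const rankedByRelevance: FlavorFindManyArgs = {
    where: { name: { fuzzy: 'vanila' } },
    orderBy: { _relevance: { fields: ['name', 'description'], search: 'vanila', sort: 'desc' } },
    take: 10,
};
```

With a real client these objects would be passed to a call along the lines of `db.flavor.findMany(rankedByRelevance)`; on MySQL or SQLite the same call is expected to fail with a `NotSupported` error.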
Prerequisites (PostgreSQL)
The user must enable the following extensions in their PostgreSQL database:
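A minimal sketch of the statements involved, assuming they are run from a migration or setup script that can execute raw SQL. The extension names come from this PR; the helper function is hypothetical.

```typescript
// Extension names are taken from the PR description; the helper is hypothetical.
const REQUIRED_EXTENSIONS = ['pg_trgm', 'unaccent'] as const;

// Builds idempotent statements, safe to re-run in a migration or setup script.
function enableExtensionStatements(): string[] {
    return REQUIRED_EXTENSIONS.map((ext) => `CREATE EXTENSION IF NOT EXISTS ${ext};`);
}

console.log(enableExtensionStatements().join('\n'));
// CREATE EXTENSION IF NOT EXISTS pg_trgm;
// CREATE EXTENSION IF NOT EXISTS unaccent;
```

Both statements need a role with permission to create extensions in the target database.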
Files changed
| File | Changes |
|---|---|
| `packages/orm/src/client/crud-types.ts` | `fuzzy`, `fuzzyContains` in `StringFilter`; `RelevanceOrderBy` type |
| `packages/orm/src/client/constants.ts` | `Fuzzy` filter kind in `FILTER_PROPERTY_TO_KIND` |
| `packages/orm/src/client/crud/dialects/base-dialect.ts` | `fuzzy`, `fuzzyContains`, `_relevance` in filter/orderBy builders; 3 abstract methods |
| `packages/orm/src/client/crud/dialects/postgresql.ts` | `pg_trgm` + `unaccent` implementation (`%`, `<%`, `similarity()`, `GREATEST()`) |
| `packages/orm/src/client/crud/dialects/mysql.ts` | `NotSupported` errors |
| `packages/orm/src/client/crud/dialects/sqlite.ts` | `NotSupported` errors |
| `packages/orm/src/client/zod/factory.ts` | `fuzzy`, `fuzzyContains`, `_relevance` |
| `tests/e2e/orm/schemas/basic/schema.zmodel` | `Flavor` model |
| `tests/e2e/orm/client-api/fuzzy-search.test.ts` | e2e test suite (PostgreSQL only) |
Implementation details
`fuzzy` filter
Uses PostgreSQL trigram similarity operator `%` with `unaccent` and `lower` for accent-insensitive, case-insensitive matching.
`fuzzyContains` filter
Uses PostgreSQL word similarity operator `<%` to check if the search term is approximately contained as a substring.
`_relevance` ordering
Uses the `similarity()` function for single fields, `GREATEST()` for multiple fields.
Test plan
- `_relevance` ordering (single field, multiple fields, with pagination)
Documentation: zenstackhq/zenstack-docs#596
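The three SQL shapes described under Implementation details can be sketched as plain string builders. This is an illustration only: the PR builds its expressions with Kysely's `sql` template, the helper names here are hypothetical, and applying `unaccent(lower(...))` inside `similarity()` is an assumption carried over from the filter description.

```typescript
// Wraps an SQL expression for accent- and case-insensitive comparison.
const normalize = (expr: string): string => `unaccent(lower(${expr}))`;

// Quotes a column name as a PostgreSQL identifier.
const quoteIdent = (name: string): string => `"${name.replace(/"/g, '""')}"`;

// `fuzzy`: trigram similarity between the whole column value and the search term.
function fuzzyFilter(column: string, param: string): string {
    return `${normalize(quoteIdent(column))} % ${normalize(param)}`;
}

// `fuzzyContains`: the search term goes on the LEFT of `<%`, which tests whether
// the left operand is similar to some contiguous part of the right operand.
function fuzzyContainsFilter(column: string, param: string): string {
    return `${normalize(param)} <% ${normalize(quoteIdent(column))}`;
}

// `_relevance`: one similarity() score per field, combined with GREATEST()
// when more than one field is given.
function relevanceExpr(columns: [string, ...string[]], param: string): string {
    const score = (column: string): string =>
        `similarity(${normalize(quoteIdent(column))}, ${normalize(param)})`;
    const [first, ...rest] = columns;
    return rest.length === 0
        ? score(first)
        : `GREATEST(${[first, ...rest].map(score).join(', ')})`;
}

console.log(fuzzyFilter('name', '$1'));
// unaccent(lower("name")) % unaccent(lower($1))
console.log(fuzzyContainsFilter('description', '$1'));
// unaccent(lower($1)) <% unaccent(lower("description"))
```

The search term is passed as a placeholder (`$1`) rather than interpolated, so the value itself stays a bound parameter.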
Summary by CodeRabbit
New Features
Tests