SearchEngine::Book and SearchEngine::Author.
Overview
- Declare associations on your model: `belongs_to :author` and/or `has_many :books`.
- Compile adds a field `reference: "authors.id"` for `author_id` in the `books` schema (`belongs_to` only; `has_many` does not emit references).
- Query uses `$assoc(…)` in selection.
- Single‑hop only: no multi‑hop (`b.field`) joins.
- Reindex rules: updates to referenced docs (e.g., `Author.name`) don’t require reindexing referencers; changing join keys or denormalized copies does.
- Cascade: single hop, detects A ↔ B cycles and skips those pairs.
- Reference validation: during `Schema.apply!` the gem resolves `reference:` targets via alias when present and falls back to the physical name if no alias exists. Aliases are still recommended for blue/green deploys, but physical‑only setups are supported.
Declaring references with the association DSL
- The schema for `books` includes the field `author_id` with `reference: "authors.id"`.
- Types must be compatible (`author_id` and `authors.id` consistently typed as string/int64).
- Array keys (`[:string]` / `[:integer]`) model one‑to‑many (e.g., `promotion_ids → promotions.id`).
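The compile step described in the bullets above can be pictured with a small, self-contained sketch. It does not use the gem: `ReferenceCompilerSketch`, `compile_field`, and the returned hash layout are hypothetical names chosen for illustration. It assumes a `belongs_to`-style declaration becomes one field entry carrying `reference: "authors.id"`, and that mismatched key types should be rejected up front.

```ruby
# Hypothetical sketch, not the gem's API. Illustrates how a belongs_to
# declaration could compile into a schema field carrying a reference.
module ReferenceCompilerSketch
  module_function

  # local_type may be a scalar (:string, :int64) or a one-element array
  # such as [:string] for one-to-many keys.
  def compile_field(local_key:, local_type:, target_collection:, foreign_key:, foreign_type:)
    element_type = local_type.is_a?(Array) ? local_type.first : local_type

    # Both sides of the join must share a type (string <-> string, int64 <-> int64).
    unless element_type == foreign_type
      raise ArgumentError,
            "type mismatch: #{local_key} is #{element_type}, " \
            "#{target_collection}.#{foreign_key} is #{foreign_type}"
    end

    {
      name: local_key.to_s,
      type: local_type.is_a?(Array) ? "#{element_type}[]" : element_type.to_s,
      reference: "#{target_collection}.#{foreign_key}"
    }
  end
end

# Usage: the books.author_id field pointing at authors.id.
BOOK_AUTHOR_FIELD = ReferenceCompilerSketch.compile_field(
  local_key: :author_id, local_type: :string,
  target_collection: "authors", foreign_key: :id, foreign_type: :string
)
```

Passing `local_type: [:string]` with a `promotions` target would yield a `string[]` field referencing `promotions.id`, which is the one‑to‑many case from the last bullet.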
Declare `has_many :books` on `SearchEngine::Author` if you need to query from Authors to Books.
Querying with joins
- Select from joined collections: `$authors(first_name,last_name),id,title`.
- Filter / sort on joined fields:
- Call `.joins(:assoc)` before referencing `$assoc.field`.
- Only single hop (`$assoc.field`) is supported; deeper paths are not.
- Unknown fields raise with suggestions when attributes are declared.
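To make the selection and filter syntax concrete, here is a minimal sketch that assembles Typesense‑style `include_fields` and `filter_by` strings for a single‑hop join. `JoinParamsSketch` and its `build` method are hypothetical and are not the gem's query API; the sketch only assumes the `$collection(...)` notation shown above and the rule that an association must be joined before it is referenced.

```ruby
# Hypothetical sketch, not the gem's API. Builds search parameters for a
# single-hop join using the $collection(...) notation.
module JoinParamsSketch
  module_function

  # joined: associations already declared via a joins-style call
  # select: base fields plus { assoc => [fields] } entries
  # where:  { assoc => "field:=value" } filter fragments
  def build(joined:, select:, where: {})
    # Every referenced association must have been joined first.
    (select.grep(Hash).flat_map(&:keys) + where.keys).each do |assoc|
      next if joined.include?(assoc)

      raise ArgumentError, "call joins(:#{assoc}) before referencing $#{assoc}"
    end

    include_fields = select.flat_map do |item|
      if item.is_a?(Hash)
        item.map { |assoc, fields| "$#{assoc}(#{fields.join(',')})" }
      else
        item.to_s
      end
    end

    params = { include_fields: include_fields.join(",") }
    unless where.empty?
      params[:filter_by] = where.map { |assoc, expr| "$#{assoc}(#{expr})" }.join(" && ")
    end
    params
  end
end
```

For example, `JoinParamsSketch.build(joined: [:authors], select: [{ authors: %i[first_name last_name] }, :id, :title])` produces the `$authors(first_name,last_name),id,title` selection from the first bullet.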
What is stored and how references are resolved
- The model DSL compiles a field‑level `reference: "<target_collection>.<foreign_key>"` (or the `;async` variant) on the referencer’s local key for `belongs_to` associations. This lives in the Typesense collection schema and is used by the server to resolve JOINs at query time.
Asynchronous references
`belongs_to :author, async_ref: true` encodes `reference: "authors.id;async"` in the schema, allowing ingestion when the `authors` document is not yet present.
- Values you store in the referencer (e.g., `author_id`) must match the target field type and value (e.g., `authors.id`). The engine does not maintain any separate link table.
- For arrays of keys, use `[:string]` / `[:integer]` on the local side; selection returns arrays of joined documents when applicable.
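The reference string layout described in this section is simple enough to encode and decode by hand. The sketch below is illustrative only: `ReferenceStringSketch`, `Ref`, `encode`, and `decode` are hypothetical names, and the `"<collection>.<key>;async"` layout is taken from this document rather than verified against the gem's source.

```ruby
# Hypothetical sketch, not the gem's API. Encodes and decodes reference
# strings of the form "<collection>.<foreign_key>" with an optional ";async".
module ReferenceStringSketch
  Ref = Struct.new(:collection, :foreign_key, :async, keyword_init: true)

  module_function

  def encode(collection:, foreign_key:, async: false)
    base = "#{collection}.#{foreign_key}"
    async ? "#{base};async" : base
  end

  def decode(reference)
    target, flag = reference.to_s.split(";", 2)
    collection, foreign_key = target.to_s.split(".", 2)
    if collection.to_s.empty? || foreign_key.to_s.empty?
      raise ArgumentError, "malformed reference: #{reference.inspect}"
    end

    Ref.new(collection: collection, foreign_key: foreign_key, async: flag == "async")
  end
end
```

Decoding `"authors.id;async"` yields the `authors` collection, the `id` key, and `async` set to true; a plain `"authors.id"` decodes with `async` false.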
When do you need to reindex?
Reindex referencers only when necessary. Quick rules:
- No reindex needed when a referenced document’s non‑key attributes change.
  - Example: updating `Author.name` is immediately visible via joins in `Book` queries.
- Reindex referencers when:
  - You change the join key value on the referencer (e.g., a book’s `author_id`).
  - You store denormalized copies of referenced fields inside the referencer (e.g., `author_name` on `books` for `query_by`).
  - You change schema in ways that affect join keys/field types.
`Schema.apply!` (blue/green) handles reindexing for the collection you are applying. Other collections don’t need reindex solely because of alias swaps; joins are resolved by the server using the latest data in the referenced collection.
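The quick rules above reduce to a small lookup. The following sketch is only a mnemonic: `ReindexRulesSketch` and its change‑kind labels are hypothetical and are not values the gem defines.

```ruby
# Hypothetical sketch of the quick rules; the change kinds are illustrative
# labels, not values defined by the gem.
module ReindexRulesSketch
  REINDEX_REQUIRED = {
    referenced_non_key_attribute: false, # e.g. Author.name changes
    referencer_join_key: true,           # e.g. a book's author_id changes
    denormalized_copy: true,             # e.g. author_name stored on books
    join_key_schema: true                # join key or field type changes
  }.freeze

  module_function

  # Returns true when referencers must be reindexed for this kind of change.
  def reindex_referencers?(change_kind)
    REINDEX_REQUIRED.fetch(change_kind) do
      raise ArgumentError, "unknown change kind: #{change_kind.inspect}"
    end
  end
end
```

Raising on unknown kinds is a deliberate choice in this sketch: silently answering "no reindex" for an unclassified change would hide stale denormalized data.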
Cascading and bi‑directional joins
The engine includes a cascade helper that discovers references (from live Typesense schemas or the compiled registry) and triggers reindexing of immediate referencers when a referenced document is updated.
- Single hop only; no transitive chaining.
- Cycle guard: immediate A ↔ B cycles are detected and skipped to avoid ping‑pong.
- Partial vs full: when safe (ActiveRecord source, no custom Partitioner), the engine performs a targeted partial rebuild of the referencer using the foreign key; otherwise it falls back to a full rebuild for that referencer.
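The three cascade rules can be sketched as a small planner. This is not the gem's cascade helper: `CascadeSketch`, `plan`, and the `partial_safe` list are hypothetical, and `references` is assumed to be a plain hash mapping each referencer collection to the collections it points at.

```ruby
# Hypothetical sketch of the cascade rules, not the gem's implementation.
module CascadeSketch
  module_function

  # Returns [{ referencer:, mode: }] for the immediate referencers of `updated`.
  # partial_safe lists referencers eligible for a targeted partial rebuild.
  def plan(updated:, references:, partial_safe: [])
    references.each_with_object([]) do |(referencer, targets), steps|
      # Single hop: only collections that reference `updated` directly.
      next unless targets.include?(updated)
      # Cycle guard: skip pairs where `updated` also references the
      # referencer (A <-> B), which would otherwise ping-pong.
      next if references.fetch(updated, []).include?(referencer)

      mode = partial_safe.include?(referencer) ? :partial : :full
      steps << { referencer: referencer, mode: mode }
    end
  end
end
```

With `books` referencing `authors` and `reviews` referencing `books`, an update to `authors` plans a rebuild for `books` only; `reviews` is two hops away and is left alone.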
Many‑to‑many pattern
Model a bridge collection with two references (one to each side). Queries can join through the bridge in a single hop from the base to the bridge. Multi‑hop (base → bridge → other) is not supported by the JOINs DSL; denormalize or run separate searches when you need multi‑hop.
Best practices
- Declare joins only where you actually need to query across collections.
- Prefer selection with minimal `$assoc(…)` fields to reduce payload and speed hydration.
- Avoid denormalizing referenced fields unless you need them for `query_by`/ranking; denormalization increases reindexing needs.
- Keep join keys typed consistently on both sides (`int64` ↔ `int64`, `string` ↔ `string`).
Troubleshooting
- Unknown association: declare it via `join :name, collection:, local_key:, foreign_key:` on the base model.
- Join not applied: call `.joins(:assoc)` before selecting or filtering on `$assoc.field`.
- Unknown joined field: verify the target collection’s attributes; suggestions are provided.
- Cycle skipped: expected with A ↔ B; use a manual targeted rebuild when you truly need to refresh referencers.
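For the "unknown joined field" case, the sketch below shows one way suggestions could be produced. How the gem actually computes them is not stated here; this stand‑in uses Ruby's bundled `did_you_mean` library, and `FieldSuggestionSketch` and `validate!` are hypothetical names.

```ruby
require "did_you_mean"

# Hypothetical sketch, not the gem's implementation. Uses Ruby's bundled
# did_you_mean spell checker as a stand-in for the suggestion logic.
module FieldSuggestionSketch
  module_function

  # Returns the field when it is declared; otherwise raises with suggestions.
  def validate!(field, declared:)
    return field if declared.include?(field)

    suggestions = DidYouMean::SpellChecker.new(dictionary: declared).correct(field)
    hint = suggestions.empty? ? "" : " Did you mean #{suggestions.join(', ')}?"
    raise ArgumentError, "unknown joined field #{field.inspect}.#{hint}"
  end
end
```

Calling `FieldSuggestionSketch.validate!("frist_name", declared: %w[first_name last_name])` raises an `ArgumentError` whose message points at `first_name`.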