Related: Models · Indexer · Deletion

Upserting documents

Use the model-level helpers on SearchEngine::Base to insert or replace documents without hand-rolling JSONL payloads or touching the indexer. These helpers reuse the same schema and mapper validations as .create, ensuring that documents remain consistent with the compiled Typesense schema.
Helper | Purpose | Returns
SearchEngine::Book.upsert(record: …) | Map a single source record and upsert it | Integer (0 or 1)
SearchEngine::Book.upsert(data: …) | Validate pre-mapped data and upsert it | Integer (0 or 1)
SearchEngine::Book.upsert_bulk(records: …) | Map many source records and import in one JSONL batch | Hash summary
SearchEngine::Book.upsert_bulk(data: …) | Validate an array of mapped hashes and import once | Hash summary

Single-document flow

book = Book.find(42)
SearchEngine::Book.upsert(record: book)      # mapper runs

mapped = SearchEngine::Book.mapped_data_for(book)
SearchEngine::Book.upsert(data: mapped)         # assume already mapped
  • Provide either record: or data: (passing both raises InvalidParams).
  • The helper ensures an id is present by running the model’s identify_by strategy when necessary.
  • doc_updated_at and hidden flags (*_empty, *_blank) are always refreshed before import.
  • Returns 1 when the Typesense import endpoint reports success, 0 when no document was sent.
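
The argument contract in the bullets above can be illustrated with a small stand-alone sketch. It does not use the gem: resolve_document, the local InvalidParams class, the mapper: callable, and the integer epoch used for doc_updated_at are all hypothetical stand-ins chosen for illustration, and the real helper may behave differently in the details.

```ruby
# Hypothetical stand-in for the record:/data: contract; not the gem's code.
class InvalidParams < ArgumentError; end

# Accepts exactly one of record: or data: and returns the Hash that would be
# imported. `mapper` is an assumed callable turning a record into a Hash.
def resolve_document(record: nil, data: nil, mapper: ->(r) { r.to_h })
  raise InvalidParams, 'provide record: or data:, not both' if record && data
  raise InvalidParams, 'provide record: or data:' if record.nil? && data.nil?

  doc = record ? mapper.call(record) : data
  raise InvalidParams, 'data: must be a Hash' unless doc.is_a?(Hash)

  # Normalize keys to strings and refresh the timestamp before import.
  # An integer epoch is an assumption made for this sketch.
  doc.transform_keys(&:to_s).merge('doc_updated_at' => Time.now.to_i)
end
```

Passing both keywords, or neither, raises before any mapping happens, which mirrors the behaviour the bullets describe.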

Bulk flow

records = Book.where(publisher_id: publisher.id).limit(100)
summary = SearchEngine::Book.upsert_bulk(records: records)
# => {
#      collection: "books",
#      docs_count: 100,
#      success_count: 100,
#      failure_count: 0,
#      bytes_sent: 12_480,
#      response: "..." # raw Typesense response (string or array)
#    }

# Or map first, then import the pre-mapped hashes in one batch:
payloads = records.map { |b| SearchEngine::Book.mapped_data_for(b) }
SearchEngine::Book.upsert_bulk(data: payloads)
Bulk helpers stream a single JSONL payload through SearchEngine::Client#import_documents with action: :upsert. Input can be any enumerable—Arrays, batch enumerators, or ActiveRecord scopes. Internally the helper normalizes the enumerable, validates each document against the schema, and computes doc_updated_at before encoding. The returned summary mirrors the indexer’s import statistics:
  • collection – physical collection chosen via alias resolution (respects into: and partition: overrides when provided).
  • docs_count – number of documents encoded into the JSONL payload.
  • success_count / failure_count – counts reported by the Typesense client (success_count is currently assumed to equal docs_count when no per-document errors are surfaced).
  • bytes_sent – size of the JSONL payload in bytes.
  • response – raw response object from Typesense (string of status lines or an array of hashes).
Wrap bulk imports in background jobs or maintenance tasks when you need to backfill a handful of documents without triggering a full indexer run.
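
To make docs_count and bytes_sent concrete, here is a minimal sketch of how a JSONL payload and a summary of that shape could be derived from an enumerable of mapped hashes. build_jsonl_summary is a hypothetical name, nothing is sent to Typesense, and the extra :payload key exists only so the encoded lines can be inspected.

```ruby
require 'json'

# Encodes documents as JSONL (one JSON object per line) and builds a summary
# shaped like the one shown earlier. Illustrative only; no network call.
def build_jsonl_summary(docs, collection:)
  lines = docs.map { |doc| JSON.generate(doc) }
  payload = lines.join("\n")

  {
    collection: collection,
    docs_count: lines.size,        # documents encoded into the payload
    bytes_sent: payload.bytesize,  # payload size in bytes, not characters
    payload: payload               # kept only for inspection in this sketch
  }
end

summary = build_jsonl_summary(
  [{ id: '1', title: 'Dune' }, { id: '2', title: 'Emma' }],
  collection: 'books'
)
puts summary[:payload]
```

Using bytesize rather than length matters once titles contain multi-byte characters, since the two values then differ.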

Validation & mapping

  • Records are mapped via the compiled mapper (SearchEngine::Mapper.for(klass)) so the same schema rules as full indexation apply (required fields, coercions, hidden flags).
  • Pre-mapped data must be provided as Hashes with string or symbol keys; any other shape raises InvalidParams.
  • When neither a source input (record: for the single helper, records: for the bulk helper) nor data: is given, an InvalidParams error is raised.
  • Invalid document shapes (missing required fields, unknown keys when strict mode is on, invalid types) raise the same SearchEngine::Errors::InvalidParams or InvalidField exceptions you see during indexing.
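
The kinds of checks listed above can be sketched in plain Ruby. This is not the gem's validator: SCHEMA, validate_document!, and the local error classes are hypothetical, and which error class is raised for which failure is an assumption made for the example.

```ruby
# Hypothetical error classes standing in for the gem's exceptions.
class InvalidParams < ArgumentError; end
class InvalidField < ArgumentError; end

# Hypothetical schema: field name => expected type and required flag.
SCHEMA = {
  'id'    => { type: String,  required: true },
  'title' => { type: String,  required: true },
  'pages' => { type: Integer, required: false }
}.freeze

# Returns the document with string keys, or raises on an invalid shape.
def validate_document!(doc, schema: SCHEMA, strict: true)
  raise InvalidParams, "expected a Hash, got #{doc.class}" unless doc.is_a?(Hash)

  doc = doc.transform_keys(&:to_s)

  missing = schema.select { |_, rule| rule[:required] }.keys - doc.keys
  raise InvalidParams, "missing required fields: #{missing.join(', ')}" if missing.any?

  unknown = doc.keys - schema.keys
  raise InvalidField, "unknown fields: #{unknown.join(', ')}" if strict && unknown.any?

  doc.each do |name, value|
    rule = schema[name]
    next if rule.nil? || value.nil? # unknown keys (non-strict) and nils are skipped
    raise InvalidField, "#{name} must be #{rule[:type]}" unless value.is_a?(rule[:type])
  end

  doc
end
```

Symbol and string keys are treated alike because keys are normalized before any check runs.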

Options

Both single and bulk helpers accept the same optional keywords:
  • into: – override the physical collection (defaults to alias resolution or configured partition resolver).
  • partition: – forward partition metadata to the resolver (pairs nicely with blue/green apply flows).

When to choose upsert helpers

  • Small batches: perfect for synchronising a handful of documents during admin workflows, background jobs, or diagnostics.
  • Ad-hoc replays: retry a small set of failed records without replaying the entire indexer.
  • Hybrid flows: combine with .mapped_data_for when your application already builds the mapped document (e.g., for audit snapshots).
For mapping source data without upserting to Typesense, use .from(data, mode: :hash) to get mapped hashes, or .from(data, mode: :instance) to get hydrated model instances. See Models → Mapping source data for details.
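
For the ad-hoc replay case, one possible way to find which documents to retry is to inspect the raw response from a bulk summary. The sketch below assumes the response is either newline-delimited JSON status lines or an array of hashes, each carrying a success flag, and that entries appear in the same order as the submitted documents. import_results and failed_positions are hypothetical helper names, not part of the gem.

```ruby
require 'json'

# Normalizes a raw import response (JSONL String or Array of Hashes) into an
# Array of Hashes with string keys.
def import_results(response)
  rows =
    case response
    when String
      response.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
    when Array
      response
    else
      raise ArgumentError, "unsupported response type: #{response.class}"
    end
  rows.map { |row| row.transform_keys(&:to_s) }
end

# Returns the 0-based positions of entries whose success flag is not truthy.
def failed_positions(response)
  import_results(response)
    .each_with_index
    .reject { |row, _index| row['success'] }
    .map { |_row, index| index }
end

raw = %({"success":true}\n{"success":false,"error":"Bad JSON.","code":400}\n{"success":true})
p failed_positions(raw) # prints [1]
```

With the positions in hand, the matching source records can be selected (for example with Array#values_at) and passed to upsert_bulk again.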

Comparison: Create vs Upsert

Feature | .create | .upsert / .upsert_bulk
API call | /documents (single) | /documents/import?action=upsert
Document count | 1 at a time | 1 (single helper) or many (bulk)
Mapper usage | Optional (hash accepted) | Mandatory for record: inputs; always validates
doc_updated_at | auto-set | auto-set
Return value | Hydrated model instance | Status counts / raw response
