Lightning.Invocation.DataclipSearchVectorWorker (Lightning v2.16.8-pre)
Backfills the full-text search_vector on dataclips rows.
Dataclips are inserted with search_vector left NULL; the vector is built
here rather than on the insert path. Building it inline was risky:
jsonb_to_tsvector over a large dataclip body is slow and runs inside the
transaction that persists the run, so a slow (or failing) vector build could
roll back the dataclip insert and lose the run (#4800). Deferring it keeps
jsonb_to_tsvector off that hot path. Search is eventually consistent as a
result, typically catching up within a minute.
Two database objects support this: safe_jsonb_to_tsvector(regconfig, jsonb),
which builds the vector from the dataclip body while tolerating NULL and
oversized input, and a partial index over search_vector IS NULL, which keeps
locating pending rows cheap as the table grows. Vectors use the
english_nostop config to match the read side (Lightning.Invocation), which
queries with to_tsquery('english_nostop', ...).
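
As a rough illustration of how the two database objects might be used together, the sketch below holds two SQL statements as Elixir strings: a partial index over pending rows and a single batch update. It is not taken from the Lightning source. The module name, the index name, and the id, body and inserted_at columns are assumptions, as is ordering by inserted_at to get newest-first behaviour. Only safe_jsonb_to_tsvector, english_nostop, dataclips and search_vector come from the text above.

```elixir
defmodule BackfillSqlSketch do
  @moduledoc false

  # Text search config named in the docs; it must match the read side.
  @config "english_nostop"

  # Partial index covering only rows that still lack a vector, so finding
  # pending rows stays cheap as the table grows.
  # Assumption: the index name and the inserted_at column are hypothetical.
  def pending_index_sql do
    """
    CREATE INDEX dataclips_pending_search_vector_idx
      ON dataclips (inserted_at DESC)
      WHERE search_vector IS NULL
    """
  end

  # One batch: select the newest pending rows and build their vectors.
  # $1 is the batch size. Assumption: id, body and inserted_at are
  # hypothetical column names.
  def backfill_batch_sql do
    """
    UPDATE dataclips
       SET search_vector = safe_jsonb_to_tsvector('#{@config}', body)
     WHERE id IN (
       SELECT id FROM dataclips
        WHERE search_vector IS NULL
        ORDER BY inserted_at DESC
        LIMIT $1
     )
    """
  end
end
```

Because the subquery's WHERE clause matches the index predicate, the planner can serve it from the partial index rather than scanning the whole table.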
Each run drains pending rows newest-first, in batches up to a per-run budget
(batch size and max batches are configurable via Lightning.Config). A run
that exhausts its budget leaves backlog behind and enqueues an immediate
follow-up (a "snowball"); otherwise the once-a-minute cron tick keeps pace.
The worker shares the search_indexing queue with
Lightning.LogLines.SearchVectorWorker; that queue runs at concurrency 2, so
the two workers each get a slot and their snowball chains never starve one
another. The cron tick and the snowball carry distinct trigger args, so job
uniqueness allows one of each to queue but never a duplicate.
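
The budgeted drain and the snowball decision described above can be sketched in plain Elixir, without any job library. This is a minimal model, not the worker's actual code. DrainSketch, the fetch_batch callback, the return tuples, and the "cron" and "snowball" trigger values are all hypothetical names chosen for illustration.

```elixir
defmodule DrainSketch do
  @moduledoc false

  # Drains at most `max_batches` batches of `batch_size` rows each.
  # `fetch_batch` is a hypothetical callback: given a limit, it indexes up
  # to that many pending rows and returns how many it processed.
  def run(fetch_batch, batch_size, max_batches) do
    drain(fetch_batch, batch_size, max_batches, 0)
  end

  # Budget exhausted: backlog may remain, so the caller should enqueue an
  # immediate follow-up job.
  defp drain(_fetch, _size, 0, total), do: {:snowball, total}

  defp drain(fetch, size, batches_left, total) do
    case fetch.(size) do
      # A short batch means nothing is left pending; the cron tick suffices.
      n when n < size -> {:done, total + n}
      # A full batch: spend one unit of budget and keep draining.
      n -> drain(fetch, size, batches_left - 1, total + n)
    end
  end

  # Distinct args per trigger, so a uniqueness check keyed on args would
  # admit one job of each kind but reject a second job of the same kind.
  def job_args(:cron), do: %{"trigger" => "cron"}
  def job_args(:snowball), do: %{"trigger" => "snowball"}
end
```

With 25 pending rows, a batch size of 10 and a budget of 2 batches, the first run processes 20 rows and asks for a snowball; the follow-up run processes the remaining 5 and reports that it is done. One consequence of this simple model is that a run which empties the backlog on exactly its last full batch still requests a follow-up, which then finds nothing to do.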