clucene: dovecot/indexer-worker Segmentation fault
Environment
We are running Dovecot in an Alpine Docker container with the dovecot-fts-lucene plugin, which depends on the clucene package. On a regular basis the dovecot/indexer-worker processes crash, leaving the index files corrupted. This slowly kills our IMAP servers, to the point where they can no longer open the dovecot.index* files and stop serving mail.
Versions / build
The original issue arose on Alpine 3.8 with all packages up to date. However, for debugging this issue I used Alpine edge. Inside the container I installed alpine-sdk, pulled the git repository and rebuilt the following packages, including sub-packages:
- clucene-2.3.3.4-r5
- dovecot-2.3.3-r0
- musl-1.1.20-r2
For those packages I also created the $pkgname-dbg sub-package so that I could run a gdb trace. To avoid notes in the trace, I also disabled optimizations and -fomit-frame-pointer in apkbuild.conf:
export CFLAGS="-O0"
export CXXFLAGS="$CFLAGS"
export CPPFLAGS="$CFLAGS"
export LDFLAGS="-Wl,--as-needed"
The bug was still reproducible after rebuilding and installing with these flags.
Symptoms
Error logs like:
imap_1 | Dec 22 12:31:53 indexer-worker(me@domain.com)<23187><mjl58Jp9RrrAqMsG:tvtGKbguHlyTWgAALwGixg>: Error: lucene index /mail/admin@usrpro.io/lucene-indexes: IndexWriter() failed (#1): Lock obtain timed out
imap_1 | Dec 22 12:31:53 indexer-worker(me@domain.com)<23187><mjl58Jp9RrrAqMsG:tvtGKbguHlyTWgAALwGixg>: Error: Mailbox INBOX: Mail search failed: Internal error occurred. Refer to server log for more information. [2018-12-22 12:31:52]
imap_1 | Dec 22 12:31:53 indexer-worker(me@domain.com)<23187><mjl58Jp9RrrAqMsG:tvtGKbguHlyTWgAALwGixg>: Error: Mailbox INBOX: Transaction commit failed: FTS transaction commit failed: backend deinit (attempted to index 1 messages (UIDs 1412..1412))
imap_1 | Dec 22 12:31:53 indexer: Error: Indexer worker disconnected, discarding 1 requests for admin@usrpro.io
imap_1 | Dec 22 12:31:53 indexer-worker(me@domain.com)<23187><mjl58Jp9RrrAqMsG:gN2TLLkuHlyTWgAALwGixg>: Fatal: master: service(indexer-worker): child 23187 killed with signal 11 (core dumped)
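To see how often the worker crashes, the Fatal lines can be filtered out of the saved container output. This is only a minimal sketch: the helper name crashed_workers and the idea that the log has been saved to a file are assumptions of mine, not part of the setup described above.

```shell
# Hypothetical helper: print every indexer-worker crash (signal 11) found
# in a saved log file. The log file path is an assumption; pass whatever
# file holds the container output.
crashed_workers() {
    # In a basic regular expression the parentheses are literal characters.
    grep 'service(indexer-worker): child [0-9]* killed with signal 11' "$1"
}

# Example (assumed file name):
#   crashed_workers imap.log | wc -l
```

Piping the result through wc -l gives a crash count for the period covered by the log.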
Backtrace
I used the dumped core file to run a full backtrace against /usr/libexec/dovecot/indexer-worker. Backtrace is attached to this issue.
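For anyone who wants to reproduce the trace, the step above can be sketched roughly as follows. The helper name bt_full and the core file location are assumptions (where the core ends up depends on the kernel's core_pattern setting); only the binary path is taken from this report.

```shell
# Hypothetical helper: write a full backtrace from a core file to stdout,
# using gdb in batch mode so it exits after running the command.
#   $1 = binary that crashed, e.g. /usr/libexec/dovecot/indexer-worker
#   $2 = path to the dumped core file (assumed; depends on core_pattern)
bt_full() {
    if [ ! -r "$2" ]; then
        echo "core file not readable: $2" >&2
        return 1
    fi
    gdb -batch -ex "bt full" "$1" "$2"
}

# Example (the core file name is an assumption):
#   bt_full /usr/libexec/dovecot/indexer-worker ./core > gdb.txt
```

The -dbg sub-packages need to be installed when gdb runs, otherwise the frames will lack symbol and local variable information.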
(from redmine: issue id 9779, created on 2018-12-22)
- Uploads:
- gdb.txt Output of gdb's bt full