
main/pcre2: remove --with-match-limit-depth option

This option is currently causing a failure in GLib's test suite on musl/Alpine: https://gitlab.gnome.org/GNOME/glib/-/issues/3159

The value of that option has remained unchanged since the package was introduced with version 10.21 in 2016, and the API documentation specifically notes that the limit is much less useful since the changes made in 10.30 and 10.32. So I'm asking whether it could be removed.

The way I read the documentation, there seems to be very little benefit to this option at this point. From https://pcre.org/current/doc/html/pcre2api.html:

int pcre2_set_depth_limit(pcre2_match_context *mcontext, uint32_t value);

This parameter limits the depth of nested backtracking in pcre2_match(). Each time a nested backtracking point is passed, a new memory "frame" is used to remember the state of matching at that point. Thus, this parameter indirectly limits the amount of memory that is used in a match. However, because the size of each memory "frame" depends on the number of capturing parentheses, the actual memory limit varies from pattern to pattern. This limit was more useful in versions before 10.30, where function recursion was used for backtracking.

The depth limit is not relevant, and is ignored, when matching is done using JIT compiled code. However, it is supported by pcre2_dfa_match(), which uses it to limit the depth of nested internal recursive function calls that implement atomic groups, lookaround assertions, and pattern recursions. This limits, indirectly, the amount of system stack that is used. It was more useful in versions before 10.32, when stack memory was used for local workspace vectors for recursive function calls. From version 10.32, only local variables are allocated on the stack and as each call uses only a few hundred bytes, even a small stack can support quite a lot of recursion.

If the depth of internal recursive function calls is great enough, local workspace vectors are allocated on the heap from version 10.32 onwards, so the depth limit also indirectly limits the amount of heap memory that is used. A recursive pattern such as /(.(?2))((?1)|)/, when matched to a very long string using pcre2_dfa_match(), can use a great deal of memory. However, it is probably better to limit heap usage directly by calling pcre2_set_heap_limit().

The default value for the depth limit can be set when PCRE2 is built; if it is not, the default is set to the same value as the default for the match limit. If the limit is exceeded, pcre2_match() or pcre2_dfa_match() returns PCRE2_ERROR_DEPTHLIMIT. A value for the depth limit may also be supplied by an item at the start of a pattern of the form
