
testing/vector: upgrade to 0.32.1

Krassy Boykinov requested to merge chereskata/aports:vector into master
Changelog:

______________________________
0.32.1:

This patch release contains a fix for a regression in 0.32.0 and fixes a few issues with the release artifacts.

3 bug fixes

    A number of sinks were emitting incorrect telemetry for the component_sent_* metrics:
        WebHDFS
        GCP Cloud Storage
        AWS S3
        Azure Blob Storage
        Azure Monitor Logs
        Databend
        Clickhouse
        Datadog Logs

    This has been corrected.
    The newly added --openssl-legacy-provider flag in 0.32.0 can now be disabled by setting it to false via --openssl-legacy-provider=false. Previously it would complain of extra arguments.
    The opentelemetry source no longer fails to decode large payloads. This was a regression in 0.31.0 when a 4 MB limit was inadvertently applied.

Known issues

    A number of sinks emit incorrect telemetry for the component_sent_* metrics:
        WebHDFS
        GCP Cloud Storage
        AWS S3
        Azure Blob Storage
        Azure Monitor Logs
        Databend
        Clickhouse
        Datadog Logs

    This is fixed in v0.32.1.
    The newly added --openssl-legacy-provider flag cannot actually be disabled by setting it to false via --openssl-legacy-provider=false. Instead it complains of extra arguments. This is fixed in v0.32.1.

______________________________
0.32.0:
7 enhancements

    The clickhouse sink database and table options are now templatable.
    The prometheus_scrape source now scrapes configured targets in parallel. Thanks to nullren for contributing this change!
    The prometheus_scrape source now has a scrape_timeout_secs option to configure how long Vector should wait for each request. Thanks to nullren for contributing this change!
    Vector’s debian Docker images are now based on Debian 12 (Bookworm).
    Vector sources that support codecs now support protobuf as an option. A Protobuf descriptor file must also be provided to decode the data. Thanks to Daniel599 for contributing this change!

    VRL’s encrypt and decrypt functions now support additional algorithms:
        CHACHA20-POLY1305
        XCHACHA20-POLY1305
        XSALSA20-POLY1305
        AES-*-CTR-BE (to disambiguate endianness of AES-*-CTR)
        AES-*-CTR-LE (to disambiguate endianness of AES-*-CTR)
    Thanks to alisa101rs for contributing this change!
    The nats source and sink have been switched to use a more modern NATS library to lay the groundwork for a JetStream source. Thanks to paolobarbolini and makarchuk for contributing this change!

2 new features

    A new greptimedb sink was added allowing Vector to send metrics to GreptimeDB. Thanks to sunng87 for contributing this change!
    Configuration fields that are field lookups (such as log_schema.timestamp_key) are now parsed at boot-time rather than run-time. In addition to better performance, this also means that invalid paths return an error at start time rather than being silently ignored at runtime.

16 bug fixes

    The lua transform now sets the source_id metadata to its own component ID if an event is emitted by the transform that has no origin source_id (e.g. events constructed in the transform itself).
    VRL conditions included in configurations (e.g. the filter transform) are now checked at boot-time to ensure that they return a boolean instead of treating all non-boolean return values as false.
    The vector sink now considers DataLoss responses to be hard errors, as they indicate that a sink in the downstream vector source rejected the data. The vector sink will no longer retry these errors and will also reject them in any connected sources (when acknowledgements are enabled). Thanks to sbalmos for contributing this change!
    The vector sink now correctly applies configured HTTPS proxy settings. Previously it would fail to validate the downstream certificate. Thanks to joemiller for contributing this change!
    The splunk_hec source now treats the fields on incoming events as “flat” rather than interpreting them as field paths. For example, an incoming foo.bar field is now inserted as {"foo.bar": "..."} rather than {"foo": {"bar": "..."}}. This avoids panics that were caused by invalid paths.
    Fractional second configuration options are now correctly parsed as fractional. Previously they would round to the nearest second. Thanks to sbalmos for contributing this change!
    The Vector API can now correctly be disabled during reload by setting api.enabled to false. Thanks to KH-Moogsoft for contributing this change!
    The component_received_event_bytes_total and component_sent_event_bytes_total metrics for sinks are now calculated after any encoding.only_fields or encoding.except_fields options are applied.
    The websocket sink now correctly sends data as binary for “binary” codecs: raw, native, and avro. Previously it would always interpret the bytes as text (UTF-8). Thanks to zhongchen for contributing this change!
    The syslog source now correctly handles escape sequences appearing in the structured data segment.
    Numeric compression levels can now be set when using TOML. Previously Vector would fail to parse the configuration. This already worked for YAML and JSON configurations.
    Sinks that support Adaptive Request Concurrency options now support configuring an initial_concurrency to start the concurrency limit at rather than starting at a limit of 1. Thanks to blake-mealey for contributing this change!
    VRL’s encode_logfmt function now escapes all values including =s.
    VRL’s parse_nginx_log function now handles more combined formats. Thanks to scMarkus for contributing this change!
    The azure_blob_storage sink now sets the correct content-type based on the configured encoding options. Thanks to stemjacobs for contributing this change!
    The vector source no longer fails to decode large payloads. This was a regression in 0.31.0 when a 4 MB limit was inadvertently applied.
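
The splunk_hec entry above distinguishes "flat" field insertion from path interpretation. The difference can be sketched in Python as follows. This is only an illustration of the two behaviours described, not Vector's actual (Rust) implementation; the function names insert_as_path and insert_flat are hypothetical.

```python
def insert_as_path(event: dict, key: str, value) -> dict:
    """Old behaviour (illustrative): dots in the key are treated as path separators."""
    parts = key.split(".")
    node = event
    for part in parts[:-1]:
        # Create intermediate objects for every path segment except the last.
        node = node.setdefault(part, {})
    node[parts[-1]] = value
    return event


def insert_flat(event: dict, key: str, value) -> dict:
    """New behaviour (illustrative): the key is used verbatim, dots included."""
    event[key] = value
    return event


nested = insert_as_path({}, "foo.bar", "...")  # {"foo": {"bar": "..."}}
flat = insert_flat({}, "foo.bar", "...")       # {"foo.bar": "..."}
```

Because the flat form never parses the key, a field name that is not a valid path cannot cause a failure at insertion time.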

______________________________
0.31.0:

12 enhancements

    The aws_s3 source now supports bucket notifications in SQS that originated as SNS messages. It still does not support receiving SNS messages directly. Thanks to sbalmos for contributing this change!
    A from_unix_timestamp function was added to VRL to decode timestamp values from unix timestamps. This deprecates the to_timestamp function, which will be removed in a future release.
    The parse_nginx_log function now supports ingress_upstreaminfo as a format.
    The format_timestamp function now supports an optional timezone argument to control the timezone of the encoded timestamp.
    Vector’s graceful shutdown time limit is now configurable (via --graceful-shutdown-limit-secs) and able to be disabled (via --no-graceful-shutdown-limit). See the CLI docs for more.
    Support for zstd compression was added to sinks that support compression. Thanks to akoshchiy for contributing this change!
    The prometheus_remote_write sink now supports zstd and gzip compression in addition to snappy (the default). Thanks to zamazan4ik for contributing this change!
    The journald source now supports a journal_namespace option to restrict the namespace of the units that the source consumes logs from.
    The gelf, native_json, syslog, and json decoders (configurable as decoding.codec on sources) now have corresponding options for lossy UTF-8 decoding via decoding.<codec name>.lossy = true|false. This can be used to accept invalid UTF-8, with invalid characters replaced before decoding.
    The aws_kinesis_firehose and aws_kinesis_streams sinks are now able to retry requests with partial failures by setting request_retry_partial to true. The default is false to avoid writing duplicate data if proper event idempotency is not in place. Thanks to dengmingtong for contributing this change!
    The component_sent_event_bytes_total and component_sent_event_total metrics can now optionally have a service and source tag added to them, driven from event data, from the added telemetry global config options. This can be used to break down processing volume by service and source.
    The internal_metrics and internal_logs sources now shut down last in order to capture as much telemetry as possible during Vector shutdown.
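
The lossy UTF-8 decoding option mentioned above can be illustrated with Python's own decoder. This is a rough analogy only: Python's errors="replace" mode stands in for Vector's lossy setting, and the helper names decode_strict and decode_lossy are hypothetical.

```python
def decode_strict(payload: bytes) -> str:
    """Strict decoding: invalid UTF-8 raises UnicodeDecodeError."""
    return payload.decode("utf-8")


def decode_lossy(payload: bytes) -> str:
    """Lossy decoding: each invalid byte becomes U+FFFD (the replacement character)."""
    return payload.decode("utf-8", errors="replace")


payload = b"caf\xff ok"  # 0xFF can never appear in valid UTF-8

lossy_text = decode_lossy(payload)  # "caf\ufffd ok"

try:
    decode_strict(payload)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

The trade-off is that lossy decoding accepts the event but discards the original invalid bytes, so it is unsuitable for payloads that are genuinely binary.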

13 bug fixes

    The fluent source now correctly sends back message acknowledgements in msgpack rather than JSON. Previously fluentbit would fail to process them. Thanks to ChezBunch for contributing this change!
    VRL now supports the \0 null byte escape sequence in strings.
    The statsd sink now correctly encodes all counters as incremental, per the spec.
    A disk buffer deadlock that occurred on start-up after certain crash conditions was fixed.
    The http_client no longer corrupts binary data by always trying to interpret it as UTF-8 bytes. Instead, options were added to decoders for lossy UTF-8 decoding (see the entry above).
    The Proxy-Authorization header is now added to HTTP requests from components that support HTTP proxies when authentication is used. Thanks to syedriko for contributing this change!
    Vector now exits non-zero if the graceful shutdown time limit expires before Vector finishes shutting down.

    The following components now log template render errors at the warning level rather than error and do not increment component_errors_total. This fixes a regression in v0.30.0 for the loki sink.
        loki sink
        papertrail sink
        splunk_hec_logs sink
        splunk_hec_metrics sink
        throttle transform
        log_to_metric transform
    The datadog_metrics sink now incrementally encodes sketches. This avoids issues users have seen with sketch payloads exceeding the limits and being dropped.
    The datadog_agent reporting of events and bytes received was fixed so it no longer double counted incoming events.
    log_schema global configuration fields can now appear in a different file than defined sources. Thanks to Hexta for contributing this change!
    Vector now supports running more than 512 sources. Previously it would lock up if more than 512 file sources were defined. Thanks to honganan for contributing this change!
    Internal metrics for the Adaptive Request Concurrency module are now correctly tagged with component metadata like other sink metrics (component_kind, component_id, component_type).

______________________________
0.30.0:

10 enhancements

    The pulsar sink supports a few new features:
        Dynamic topics using a topic template
        Can receive both logs and metrics
        Dynamic message properties can be set via properties_key

    This brings functionality in-line with that which is supported by the kafka sink.
    Thanks to addisonj for contributing this change!
    The kubernetes_logs source supports a new use_apiserver_cache option to have requests from Vector hit the Kubernetes API server cache rather than always hitting etcd. It can significantly reduce Kubernetes control plane memory pressure in exchange for a chance of receiving stale data. Thanks to nabokihms for contributing this change!
    The appsignal sink now allows configuration of TLS options via the tls config field. This brings it in-line with other sinks that support TLS. Thanks to tombruijn for contributing this change!
    The amqp sink now allows configuration of the content_encoding and content_type message properties via the new properties configuration option. Thanks to arouene for contributing this change!
    The docker_logs source now supports usage of the tcp:// scheme for the host option. The connection is the same as if the http:// scheme was used. Thanks to OrangeFlag for contributing this change!
    Vector’s distroless libc docker images (tags ending in -distroless-libc) are now based on Debian 11 rather than Debian 10. This matches Vector’s published Debian images (tags ending in -debian). Thanks to SIPR-octo for contributing this change!
    The aws_s3 source and aws_s3 sink now have full support for codecs and can receive/send any event type allowing aws_s3 to be used as a transport layer between Vector instances.
    The tag_cardinality_limit transform now includes the metric_name field on logs it produces to more easily identify the metric that was limited. Thanks to nomonamo for contributing this change!
    HTTP-based sinks now log the underlying error if an unexpected error condition is hit. This makes debugging easier.
    AWS components now allow configuring auth.region without any of the other authentication options so that a different region can be given to the default authentication provider chain than the region that the component is otherwise connecting to.

7 bug fixes

    Disk buffers now recover from partial writes that can occur during unclean shutdowns.
    The influxdb_logs sink now correctly encodes logs when tags are present. Thanks to juvenn for contributing this change!
    The loki sink now warns when added labels collide via wildcard expansion. Thanks to hargut for contributing this change!
    The elasticsearch sink now uses the correct API to automatically determine the version of the downstream Elasticsearch instance (when api_version = "auto"). Thanks to syedriko for contributing this change!
    The gcp_stackdriver_metrics sink now correctly refreshes the authentication token before it expires.
    Vector’s internal logs were updated to use “suppress” rather than “rate-limit” in the hopes that it makes it clearer that it is only Vector’s log output that is being suppressed, rather than data processing being throttled.
    The kafka source now attempts to send any pending acknowledgements to the Kafka server before reading additional messages to process.
Edited by Krassy Boykinov
