The format is based on Keep a Changelog, and this project adheres to Semantic Versioning. See MAINTAINERS.md for instructions on keeping it up to date.
Operators: this file is written so its content can be copied straight into your project's changelog.
If you were at firehose-core version 1.0.0 and are bumping to 1.1.0, copy the content between those two versions into your own repository, replacing the placeholder value `fire{chain}` with your chain's own binary name.
- Added `--shift-ports` global flag that shifts all Firehose service port numbers by a given offset, useful for running multiple instances on the same machine without port conflicts. Both server listen addresses and internal client connection addresses are shifted so wiring stays consistent. Infrastructure ports (Prometheus metrics, pprof, log-level-switcher) are also shifted. Example: `fire{chain} start --shift-ports 100` shifts all ports by +100.
- Added `--merger-max-merging-threads` (default: 4) so that the merger can merge blocks in parallel (still using far less RAM than the previous one-block-preloading method)
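The offset is a plain addition on every port; a minimal shell sketch of the mapping (the base port numbers below are illustrative assumptions, not firehose-core's actual defaults):

```shell
# Hypothetical base ports; real defaults depend on your chain's binary.
SHIFT=100
for port in 10015 10016 9102; do
  echo "${port} -> $((port + SHIFT))"
done
```

A second instance on the same machine could then use `--shift-ports 200` so neither instance collides with the other or with the defaults.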
- S3 store: configurable HTTP connection pool via the `DSTORE_S3_MAX_IDLE_CONNS`, `DSTORE_S3_MAX_IDLE_CONNS_PER_HOST` and `DSTORE_S3_IDLE_CONN_TIMEOUT` env vars
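For example, an operator could tune the pool before starting the binary; the values below are illustrative assumptions, not recommended defaults:

```shell
# Illustrative values only; tune to your S3 endpoint and workload.
export DSTORE_S3_MAX_IDLE_CONNS=200
export DSTORE_S3_MAX_IDLE_CONNS_PER_HOST=100
export DSTORE_S3_IDLE_CONN_TIMEOUT=90s
# then start the service as usual, e.g.:
# fire{chain} start ...
```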
- Removed parallel preloading of one-block-files to reduce RAM usage when merging big blocks.
Note
With this change, the HEAD block timestamp is now updated at most every 5 seconds instead of at every block, by reading the first 500 bytes of the last one-block-file.
- S3 store: fixed goroutine leak caused by connection pool exhaustion on single-host S3 stores (e.g. MinIO); the HTTP body is now explicitly drained and closed, and the transport is configured with `MaxIdleConnsPerHost=100` by default
- Fix substreams support for requests with 'application/grpc-web*' content-type (old connect-web library)
- Add 'mindreader stats' info log every 30 seconds with performance metrics
- Add `firecore tools networks list` command to display registered networks from The Graph Networks Registry with their Firehose and Substreams endpoints. Supports `--name-only` flag for listing only network IDs and `--only` flag for filtering networks using a regular expression.
- Add `substreams-tier2-authenticator` flag to specify the authenticator to use for tier2 requests. Can be 'trust://' (default, same as previous behavior) or 'secret://'
- Add `substreams-tier1-subrequests-secret-key` flag to specify the secret key to use for tier1 subrequests authentication when using the 'secret://' authenticator on tier2
- Add `reader-node-grpc-secret-key` flag to specify the secret key to use for reader node gRPC authentication
- Add `?secret=...` parsing to `relayer-sources`
- Add Prometheus metrics for reader test mode: track blocks compared, success/failure counts, and success/failure percentages for easy monitoring at interval stats.
- Refactor reader test mode Prometheus metrics to fix incorrect success/failure percentage calculation caused by unaccounted blocks. Renamed `blocks_matched_total` -> `blocks_compared_matched_total` and `blocks_mismatched_total` -> `blocks_compared_mismatched_total`. The `blocks_compared_total` metric now counts only blocks that were fully compared (matched + mismatched). Added three new metrics: `blocks_seen_total` (all attempted blocks), `blocks_reorg_total` (skipped due to re-org/ID mismatch), and `blocks_fetch_failure_total` (failed to fetch from production). Invariants: `blocks_seen == blocks_reorg + blocks_fetch_failure + blocks_compared` and `blocks_compared == blocks_compared_matched + blocks_compared_mismatched`.
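The two invariants can be checked mechanically against scraped counter values; a minimal sketch with made-up numbers:

```shell
# Sample counter values (made up for illustration).
seen=1000; reorg=5; fetch_failure=2; matched=990; mismatched=3
compared=$((matched + mismatched))
if [ "$seen" -eq $((reorg + fetch_failure + compared)) ]; then
  echo "invariants hold"
else
  echo "counter leak: some blocks are unaccounted for"
fi
```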
- Fix substreams/firehose endpoints detection of supported compression: do not fail on 'algo;q=x.y' syntax
- Fix substreams tier2 jobs behind load balancer: will now retry forever on 'Unavailable: no healthy upstream' errors
- Fix relayer failing to get back to live if reader blocks are unlinkable after a long period and the merger has removed one-blocks: it will now shut down in that case, so it can be restarted.
- RPC V4 protocol with `BlockScopedDatas` batching: multiple `BlockScopedData` messages are now batched into a single `BlockScopedDatas` response, reducing gRPC round-trips and message framing overhead during backfill. Clients automatically fall back V4 → V3 → V2 when connecting to older servers, so no flag changes are required.
- S2 compression is now the default: replaces gzip as the default compression algorithm, providing ~3-5x faster compression/decompression with comparable ratios. The client automatically negotiates compression with the server.
- VTProtobuf fast serialization: Both client and server now use vtprotobuf for protobuf marshaling/unmarshaling, providing ~2-3x faster serialization with reduced memory allocations.
- Server-side message buffering: configurable via the `--substreams-tier1-output-buffer-size` flag (default: 100 blocks) or the `MESSAGE_BUFFER_MAX_DATA_SIZE` environment variable (default: 10MB).
- Improved Connect/gRPC protocol selection: the server now efficiently routes requests to the appropriate handler based on content-type, improving performance by ~15% for pure gRPC clients.
- New blocks from last partial: "last partial blocks" are now accepted interchangeably with `new` blocks, allowing faster full blocks for requests that do not ask for partial blocks.
- Add `firecore tools substreams logs connections <user_id>` command to query Cloud Logging and show Substreams connections for an organization. Correlates incoming requests with stats by trace ID and presents a summary table showing active, closed, and error connections with details like network, source IP, module, duration, and blocks processed.
- Remove alpha partial blocks support in firehose service (only exposed via substreams)
- Substreams: Improved 'partial blocks': support new pbbstream's "LastPartial" field, fix 'undo' scenarios for stores
- Reduce RAM usage with partial blocks (relayer, substreams, firehose)
- Prevent panic if transactionTrace.receipt is nil in LogFilter (even if it is not a normal scenario)
- Bump Golang used for builds to 1.25
- Fix issue where a retry on dstore while writing a fullKV would corrupt the file, making it unreadable. Fix prevents this and also now deletes affected files when they are detected
- Fix bug where a request could get stuck forever (until the client drops or the server restarts).
- Fix issue where transient HTTP/2 stream errors (e.g. `INTERNAL_ERROR`) from `dstore` were being treated as fatal errors instead of being retried. These transient network errors are now detected and retried with exponential backoff.
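The retry pattern is classic exponential backoff; a minimal sketch of the delay schedule (delays are printed rather than slept, and the doubling base is an assumption, not the library's actual tuning):

```shell
# Illustrative backoff schedule: 1s, 2s, 4s, 8s, 16s.
delay=1
for attempt in 1 2 3 4 5; do
  echo "attempt ${attempt}: wait ${delay}s before retrying the transient error"
  delay=$((delay * 2))
done
```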
- Added bucketed Prometheus metric `head_block_relative_time_sum` to help investigate latency and pipeline performance:
  - "app=firehose_output" and "app=substreams_output" show the latency between outputting live blocks and their block time.
  - "app=relayer" for latency at the relayer's input
  - "app=reader-node" for latency at the reader's input
- Bump base64 library in the reader to a much faster one
- dstore: bumped google storage lib to v1.59.1 to fix a bug in their multi-range downloader, in case it affects us
- Fixed underflow in 'FailedPrecondition desc = request needs to process a total of x blocks' error when running from 'substreams run' with a start-block in the future.
- Fix issue where "live backfiller" would not create segments after reconnecting with a cursor starting from a previous quicksave, causing delays in future reconnection
- Prevent "panic" when log messages are too large: instead, they will be truncated with a 'some logs were truncated' message.
- Raise max individual log message size from 128k to 512k
- Raise max log message size for a full block from 128k to 5MiB
- Reduce log level from Warn to Debug when we fail to get or set the store size (for backends that don't support it)
- Removed PartialsData message and brought back this data inside the good old BlockScopedData
- Added the following fields to `BlockScopedData`:
  - bool `is_partial` to indicate if this block is a partial block. The following two fields are only present when `is_partial == true`:
  - optional bool `is_last_partial` to indicate if this is the last partial of a given block (with correct block hash)
  - optional uint32 `partial_index` to indicate the index of this partial block within the full block
- Renamed `partial_blocks_only` flag to `partial_blocks` on the substreams Blocks request
- Removed `include_partial_blocks` flag from the substreams Blocks request
- Migrated from AWS SDK for Go v1 to v2 (`github.com/aws/aws-sdk-go` → `github.com/aws/aws-sdk-go-v2`)
- Added support for "workload identity credentials" in Azure. The order of preference is:
  - If `AZURE_STORAGE_KEY` is set, use shared key credential (previous behavior)
  - Otherwise, use DefaultAzureCredential, which supports:
    - Managed Identity (for Azure resources)
    - Service Principal (via AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, AZURE_TENANT_ID)
    - Azure CLI credentials
    - Visual Studio Code credentials
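The selection reduces to a simple check; a sketch of the documented preference order (the real credential wiring lives in the store's Azure backend, this only mirrors the decision):

```shell
# Pick the credential type the same way the store would.
if [ -n "${AZURE_STORAGE_KEY:-}" ]; then
  cred="shared-key"                 # previous behavior
else
  cred="DefaultAzureCredential"     # managed identity, service principal, CLI, VS Code
fi
echo "using ${cred}"
```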
- Added experimental support for partial blocks (e.g. Flashblocks on Base). See https://docs.substreams.dev/reference-material/chains-and-endpoints/flashblocks for details about how they work in Substreams.
  - `SUBSTREAMS_BIGGEST_PARTIAL_BLOCK_INDEX` environment variable now specifies the index to use when bundling the "last partial block" from the full block. (default: 10, for Base)
  - Added flag `--include-partial-blocks` on `tools firehose-client`
- Substreams: fixed issue where "live backfiller" would not create segments after reconnecting with a cursor starting from a previous quicksave, causing delays in future reconnection
- Fixed substreams regression in v1.12.2 where some jobs would not get scheduled correctly, resulting in failure with the message `get size of store "...": opening file: not found`.
Important
This version contains a bug in the scheduling of substreams stages which can cause some requests to fail with the message `get size of store "...": opening file: not found`.
Operators are advised to upgrade to v1.12.3 as soon as possible.
- `firecore tools firehose-client` and `firecore tools firehose-single-block-client` now accept Network Registry IDs or aliases directly.
- Fixed `firecore tools download-from-firehose` cursor handling to avoid erroneous "this endpoint is serving blocks out of order" issues.
- Fix egress bytes calculation when running in noop or dev mode with specified output debug modules.
- Add support for the foundational store v2 protocol.
- Reduced memory usage when loading large stores
- Added opt-in memory limits related to loading FullKV stores, gated by environment variables:
  - `SUBSTREAMS_STORE_SIZE_LIMIT_PER_REQUEST` (default allows 5GiB: `5368709120`): limit the size of all loaded stores for a single request. Set to a numeric value in bytes.
  - `SUBSTREAMS_ENFORCE_STORE_SIZE_LIMIT_PER_REQUEST` (default: false): if set to `true`, enforce the limit above instead of just logging a warning
  - `SUBSTREAMS_TOTAL_STORE_SIZE_LIMIT_PERCENT` (default: 75): limit the in-memory size of all stores loaded concurrently on the instance, as a percentage of usable memory (cgroup or system total -- regardless of free or available)
  - `SUBSTREAMS_ENFORCE_TOTAL_STORE_SIZE_LIMIT` (default: false): if set to `true`, enforce the limit above instead of just logging a warning
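For example, to keep the default 5GiB per-request cap but actually enforce it rather than only logging a warning:

```shell
# 5GiB expressed in bytes, matching the documented default.
export SUBSTREAMS_STORE_SIZE_LIMIT_PER_REQUEST=$((5 * 1024 * 1024 * 1024))
export SUBSTREAMS_ENFORCE_STORE_SIZE_LIMIT_PER_REQUEST=true
echo "$SUBSTREAMS_STORE_SIZE_LIMIT_PER_REQUEST"
```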
- Fixed an edge case where substreams with modules depending on stores that start in the future would fail and incorrectly report an error about the "tier2 version being incompatible"
- Added `firehose_session_denied_counter` with a `reason` label, incremented each time a session is refused with the reason why it was refused.
- Fix a panic (nil pointer) when skipping blocks via indexes on stores on tier2
- Add store size to substreams starts
- Add store and foundational-store list to incoming request stats
- This new endpoint removes the need for complex "mangling" of the package on the client side.
- Instead of expecting `sf.substreams.v1.Modules` (with the client having to apply parameters, network, etc.), the `sf.substreams.rpc.v3.Request` now expects:
  - a `sf.substreams.v1.Package`
  - a `map<string, string>` of `params`
  - the `network` string

  These will all be applied to the package server-side.
- It returns the same object as the v2 endpoint, i.e. a stream of `sf.substreams.rpc.v2.Response`
- It is added on top of the existing 'v2' endpoint, both being active at the same time.
- To enable it, operators will simply need to ensure that their routing allows the `/sf.substreams.rpc.v3.Stream/*` path.
- Cached spkg on the server will now contain protobuf definitions, simplifying debugging of user requests.
- Emitted metrics for requests can now be `sf.substreams.rpc.v3/Blocks` instead of always `sf.substreams.rpc.v2/Blocks`; make sure that your metering endpoint can support it.
Note: recent substreams clients will support both endpoints, first trying the v3 and automatically falling back to v2 if they hit a "404 Not Found" or "Not Implemented" error.
- Fixed a bug with BlockFilter: a skipped module would send BlockScopedData (in dev or near HEAD, to follow progress) with an empty module name, breaking some sinks. Module name was present if requesting a module dependent on that skipped module. Now the module name is always included.
- Improved panic message when the reader node encounters a block whose finality is bigger than the block itself to include `lib_num`, `block_num`, `distance`, and `max_distance` for easier debugging.
- Updated `firehose-networks` dependency to `v0.2.2` (latest).
- Fixed `common-one-block-store-url` flag not expanding environment variables in all apps.
- Updated Wasmtime runtime from v30.0.0 to v36.0.0, bringing performance improvements, inlining support, Component Model async implementation, and enhanced security features.
- Added WASM bindgen shims support for the Wasmtime runtime to handle WASM modules with WASM bindgen imports (when the Substreams module binary is defined as type `wasm/rust-v1+wasm-bindgen-shims`).
- Added support for foundational-store (in wasmtime and wazero).
- Added foundational-store gRPC client to the substreams engine.
- Fixed module caching to properly handle modules with different runtime extensions.
- 'paymentgateway' metering plugin renamed to `tgm`; it now supports the `indexer-api-key` parameter.
- Concurrent streams and workers limits are now handled under the new session plugin, available under the `common-session-plugin` argument.
- The following flags were removed, now handled by that session plugin:
  - `substreams-tier1-global-worker-pool-address`
  - `substreams-tier1-global-request-pool-address`
  - `substreams-tier1-global-worker-pool-keep-alive-delay`
  - `substreams-tier1-global-request-pool-keep-alive-delay`
  - `substreams-tier1-default-max-request-per-user`
  - `substreams-tier1-default-minimal-request-life-time-second`
- To use thegraph.market as a session plugin, use `--common-session-plugin=tgm://session.thegraph.market:443?indexer-api-key={your-api-key}` (requires a specific indexer API key); see https://github.com/streamingfast/tgm-gateway/tree/develop/session for details on the various flags
- To use simple local session management, use `--common-session-plugin=local://?max_sessions=30&max_sessions_per_user=3&max_workers_per_user=10&max_workers_per_session=10`; see https://github.com/streamingfast/dsession/tree/main/local for details on those flags
- Note: the 'max_sessions' parameter from the `common-session-plugin` is now also used to limit the number of firehose streams.
- If you were using a custom gRPC implementation for `--substreams-tier1-global-worker-pool-address` and `--substreams-tier1-global-request-pool-address` (ex: localhost:9010), simply use this value for the session plugin: `--common-session-plugin=tgm://localhost:9010?plaintext=true`; it is compatible.
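Assembling the two documented plugin URLs in a start script might look like this (the API key is a placeholder, not a real credential):

```shell
API_KEY="your-indexer-api-key"   # placeholder, replace with your own
TGM_PLUGIN="tgm://session.thegraph.market:443?indexer-api-key=${API_KEY}"
LOCAL_PLUGIN="local://?max_sessions=30&max_sessions_per_user=3&max_workers_per_user=10&max_workers_per_session=10"
echo "--common-session-plugin=${TGM_PLUGIN}"
echo "--common-session-plugin=${LOCAL_PLUGIN}"
```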
- Fix a slow memory leak around metering plugin on tier2
- Add a maximum execution time for a full tier2 segment. By default, this is 60 minutes. It will fail with `rpc error: code = DeadlineExceeded desc = request active for too long`. It can be configured via the `--substreams-tier2-segment-execution-timeout` flag
- Fix `subscription channel at max capacity` error: when the LIVE channel is full (ex: slow module execution or slow client reader), the request will be continued from merged files instead of failing, and gracefully recover if performance is restored.
- Improve log message for 'request active for a long time', adding stats.
- Improved how `firecore tools --output=protojson` and `firecore tools --output=json` render the `pbbstream.Block` type, now printing the underlying chain's specific block.
- Fix thread leak on filereader.
- If `--advertise-chain-name` is set, the `substreams-tier1` app will now infer the default `--substreams-tier1-block-type` value by using the chain's name and extracting the chain's block type Protobuf package id, which fixes some cases where `substreams-tier1` waits for 100 blocks before starting up.
People using their own authentication layer will need to consider these changes before upgrading!
- Renamed config headers that come from the authentication layer:
  - `x-sf-user-id` renamed to `x-user-id` (from dauth module)
  - `x-sf-api-key-id` renamed to `x-api-key-id` (from dauth module)
  - `x-sf-meta` renamed to `x-meta` (from dauth module)
  - `x-sf-substreams-parallel-jobs` renamed to `x-substreams-parallel-workers`
- Allow decreasing `x-substreams-parallel-workers` through an HTTP header (the auth layer determines the upper bound)
- Detect the value for the 'stage layer parallel executor max count' based on the `x-plan-tier` header (removed `x-sf-substreams-stage-layer-parallel-executor-max-count` handling)
- Added `tgm://auth.thegraph.market?indexer-api-key=<API_KEY>&reissue-jwt-max-age-secs=600` plugin that allows an indexer to use The Graph Market as the authentication source. An API key with the special "indexer" feature is needed to allow repeated calls to the API without rate limiting (for key-based authentication and reissuance of "untrusted long-lived JWTs").
- Added mechanism to immediately cancel pending requests that are doing an 'external call' (ex: eth_call) on a given block when it gets forked out (UNDO because of a reorg).
- Fixed handling of invalid module kind: prevent heavy logging from recovered panic
- Errors considered deterministic (which are cached forever) are now suffixed with `<original message> (deterministic error)`.
- [OPERATORS] Tier2 servers must be upgraded BEFORE tier1 servers
- tier2 servers will now stream outputs for the 'first segment', to speed up time to first block
- Progress notifications will only be sent every 500ms for the first minute, then reduce rate up to every 5 seconds (can be overridden per request)
- Return 'processed blocks' counter to client at the end of the request
- Added `dev_output_modules` to the protobuf request (if present, in dev mode, only send the output of the listed modules)
- Added `progress_messages_interval_ms` to the protobuf request (if present, overrides the rate of progress messages to that many milliseconds)
[Broken release, do not use]
- This release is a hotfix for a thread leak leading to a slow memory leak.
Rework the execout File read/write to improve memory efficiency:
- This reduces the RAM usage necessary to read and stream data to the user on tier1, as well as to read the existing execouts on tier2 jobs (in multi-stage scenarios)
- The cached execouts need to be rewritten to take advantage of this, since their data is currently not ordered: the system will automatically load and rewrite existing execouts when they are used.
- Code changes include:
  - new FileReader / FileWriter that "read as you go" or "write as you go"
  - no more 'KV' map attached to the File
  - split the IndexWriter away from its dependencies on execoutMappers
  - Clock distributor now also reads "as you go", using a small "one-block cache"
- Removed `SUBSTREAMS_OUTPUT_SIZE_LIMIT_PER_SEGMENT` env var (since this is no longer a RAM issue)
- Add `uncompressed_egress_bytes` field to the `substreams request stats` log message
- (dstore) Add storageClass query parameter for s3:// urls on stores (@fschoell)
- Update the firehose-beacon proto to include the new Electra spec in the 'well-known' protobuf definitions (@fschoell)
- Use The Graph's Network Registry to recognize chains by genesis blocks and fill the 'advertise' server on substreams/firehose
- Tier2 jobs now write mapper outputs "as they progress", preventing memory usage spikes when saving them to disk.
- Tier2 jobs now limit writing and loading mapper output files to a maximum size of 8GiB by default.
- Tier2 jobs now release `existingExecOuts` memory as blocks progress
- Speed up DeleteByPrefix operations on all tiers (5x perf improvement on some heavy substreams)
- Added `SUBSTREAMS_OUTPUT_SIZE_LIMIT_PER_SEGMENT` (int) environment variable to control this new limit.
- Added `SUBSTREAMS_STORE_SIZE_LIMIT` (uint64) env var to allow overwriting the default 1GiB value
- Added `SUBSTREAMS_PRINT_STACK` (bool) env var to enable printing full stack traces when a caught panic occurs
- Added `SUBSTREAMS_DEBUG_API_ADDR` (string) environment variable to expose a "debug API" HTTP interface that allows blocking connections, running GC, listing or canceling active requests.
- Prevent a deterministic failure on a module definition (mode, valueType, updatePolicy) from persisting when the issue is fixed in the substreams.yaml streamingfast/substreams#621
- Metering events on tier2 now bundled at the end of the job (prevents sending metering events for failing jobs)
- Added metering for: "processed_blocks" (block * number of stages where execution happened) and "egress_bytes"
- (RAM+CPU) dedupe execution of modules with same hash but different name when computing dependency graph. (#619)
- (RAM) prevent memory usage burst on tier2 when writing mapper by streaming protobuf items to writer
- Tier1 requests will no longer error out with "service currently overloaded" while tier2 servers are ramping up
- Add `reader-node-firehose` app, which creates one-blocks by consuming blocks from an already existing Firehose endpoint. This can be used to set up an indexer stack without having to run an instrumented blockchain node, or to get redundancy from another Firehose provider.
- Bumped grpc-go lib to 1.72.0
- Now building `amd64` and `arm64` Docker images on push & release.
- Flag `--reader-node-arguments` can now expand `{first-streamable-block}` with the value defined by the config flag `--common-first-streamable-block`.
- Flag `--reader-node-arguments` will now expand environment variables if present within the string.
- Bump substreams to v1.15.2
  - fix the 'quicksave' feature on substreams (incorrect block hash on quicksave)
- Save deterministic failures in WASM in the module cache (under a file named `errors.0123456789.zst` at the failed block number), so further requests depending on this module at the same block can return the error immediately without re-executing the module.
- Fix a panic when a module times out on tier2 while being executed from cached outputs
- Add environment variables to control retry behavior: `SUBSTREAMS_WORKER_MAX_RETRIES` (default 10) and `SUBSTREAMS_WORKER_MAX_TIMEOUT_RETRIES` (default 2), changing from the previous defaults (720 and 3). `SUBSTREAMS_WORKER_MAX_TIMEOUT_RETRIES` is the number of retries specifically applied to block execution timing out (ex: because of external calls)
- The mechanism to slow down processing segments "ahead of blocks being sent to user" has been disabled on "noop-mode" requests, since these requests are used to pre-cache data and should not be slowed down.
- The "number of segments ahead" in this mechanism has been increased from `<number of parallel workers>` to `<number of parallel workers> * 1.5`
- Tier2 now returns gRPC error codes `DeadlineExceeded` when it times out, and `ResourceExhausted` when a request is rejected due to overload
- Tier1 now correctly reports tier2 job outcomes in the `substreams request stats`
- Added jitter in "retry" logic to prevent all workers from retrying at the same time when tier2 servers are overloaded
- Bugfix for panics on some requests
- Properly reject requests with a stop-block below the "resolved" StartBlock (caused by module initialBlocks or a chain's firstStreamableBlock)
- Added the `resolved-start-block` to the `substreams request stats` log
- Fixed `runtime error: slice bounds out of range` error on heavy memory usage with the wasmtime engine
- Added a validation on a module for the existence of 'triggering' inputs: the server will now fail with a clear error message when the only available inputs are stores used with mode 'get' (not 'deltas'), instead of silently skipping the module on every block.
- Added a mechanism for 'production-mode' requests where the tier1 will not schedule tier2 jobs over { max_parallel_subrequests } segments above the current block being streamed to the user. This will ensure that a user slowly reading blocks 1, 2, 3... will not trigger a flood of tier2 jobs for higher blocks, let's say 300_000_000, that might never get read.
- Improved connection draining on shutdown: Now waits for the end of the 'shutdown-delay' before draining and refusing new connections, then waits for 'quicksaves' and successful signaling of clients, up to a max of 30 sec.
- Added information about the number of blocks that need to be processed for a given request in the `sf.substreams.rpc.v2.SessionInit` message
- Added an optional field `limit_processed_blocks` to the `sf.substreams.rpc.v2.Request`. When set to a non-zero value, the server will reject a request that would process more blocks than the given value with the `FailedPrecondition` gRPC error code.
- Improved error messages when a module execution is timing out on a block (ex: due to a slow external call); now returns a `DeadlineExceeded` Connect/gRPC error code instead of an `Internal` one. Removed 'panic' from the wording.
- In the 'substreams request stats' log, added fields: `remote_jobs_completed`, `remote_blocks_processed` and `total_uncompressed_read_bytes`
- Fix another `cannot resolve 'old cursor' from files in passthrough mode -- not implemented` bug when receiving a request in production-mode with a cursor that is below the "linear handoff" block
- Rust modules will now be executed with `wasmtime` by default instead of `wazero`.
  - Prevents the whole server from stalling in certain memory-intensive operations in wazero.
  - Speed improvement: cuts the execution time in half in some circumstances.
  - Wazero is still used for modules with `wbindgen` and modules compiled with `tinygo`.
  - Set env var `SUBSTREAMS_WASM_RUNTIME=wazero` to revert to the previous behavior.
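Reverting every Rust module to the old engine is a one-liner in the service environment:

```shell
# Opt out of wasmtime and go back to wazero for all modules.
export SUBSTREAMS_WASM_RUNTIME=wazero
```

Unsetting the variable restores the new `wasmtime` default.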
- Implement "QuickSave" feature to save the state of "live running" substreams stores when shutting down, and then resume processing from that point if the cursor matches.
  - Added flag `substreams-tier1-quicksave-store` to enable quicksave when non-empty (requires `--common-system-shutdown-signal-delay` to be set to a long enough value to save the in-flight stores)
- The `firecore tools print one-block` command is now able to print from a file directly.
- Added `firecore tools relayer stream <endpoint> [(+<count>|<stopBlock>)]` to connect to a relayer component through gRPC and stream data out, with output controlled by the `tools --output` flag.
- Integrated the `GlobalRequestPool` service in the `Tier1App` to manage global request pooling.
- Integrated the `GlobalWorkerPool` service in the `Tier1App` to manage global worker pooling.
- Added flag `substreams-tier1-global-worker-pool-address`: the address of the global worker pool to use for the substreams tier1 (disabled if empty)
- Added flag `substreams-tier1-global-worker-pool-keep-alive-delay`: delay between two keep-alive calls to the global worker pool. Default is 25s
- Added flag `substreams-tier1-global-request-pool-keep-alive-delay`: delay between two keep-alive calls to the global worker pool for requests. Default is 25s
- Added flag `substreams-tier1-default-max-request-per-user`: default max requests per user, used if the global worker pool is not reachable. Default is 5
- Added flag `substreams-tier1-default-minimal-request-life-time-second`: default minimal request lifetime, used if the global worker pool is not reachable. Default is 180
- Limit parallel execution of a stage's layer: previously, the engine executed all modules in a stage's layer in parallel. Development mode will from now on execute everything sequentially, and production mode will limit parallelism to 2 (hard-coded) for now. The auth plugin can control that value dynamically by providing a trusted header `X-Sf-Substreams-Stage-Layer-Parallel-Executor-Max-Count`.
- Add shared cache for tier1 execution near HEAD, to prevent multiple tier1 instances from reprocessing the same module on the same block when it comes in (ex: foundational modules)
- Improved fetching of state caches on tier1 requests to speed up "time to first data"
- Fixed a regression since "v1.7.3" where the SkipEmptyOutput instruction was ignored in substreams mappers
- Make the 'compare-blocks' command support one-block stores as well as merged-blocks
- Bump `substreams` lib to `v1.12.3`
  - Improved logging of requests beginning/end
  - Improved `noop` mode (now sends less data)
- Bump `substreams` lib to `v1.12.2`
  - fix panic when using an index that allows `skip_empty_output`
- Fixed `substreams-tier2` not setting itself ready correctly on startup since `v1.7.0`.
- Added support for `--output=bytes` mode which prints the chain's specific Protobuf block as bytes; the encoding for the printed bytes string is determined by `--bytes-encoding` (uses `hex` by default).
- Added back `-o` as shorthand for `--output` in `firecore tools ...` sub-commands.
- Add back `grpc.health.v1.Health` service to `firehose` and `substreams-tier1` services (regression in 1.7.0)
- Give precedence to the tracing header `X-Cloud-Trace-Context` over `Traceparent` to prevent user systems' trace IDs from leaking past a GCP load-balancer
- Reader Node Manager HTTP API now accepts `POST http://localhost:10011/v1/restart<?sync=true>` to restart the underlying reader node binary sub-process. This is an alias for `/v1/reload`.
- Enhanced `firecore tools print merged-blocks` with various small quality-of-life improvements:
  - Now accepts a block range instead of a single start block.
  - Passing a single block as the block range will print this single block alone.
  - Block range is now optional, defaulting to run until there are no more files to read.
  - It's possible to pass a merged-blocks file directly, with or without an optional range.
Important
This release will reject firehose connections from clients that don't support GZIP or ZSTD compression. Use `--firehose-enforce-compression=false` to keep the previous behavior, then check the logs for incoming Substreams Blocks request logs with the value `compressed: false` to track users who are not using compressed HTTP connections.
Important
This release removes the old `sf.firehose.v1` protocol (replaced by `sf.firehose.v2` in 2022; this should not affect any reasonably recent client).
- Add support for ConnectWeb firehose requests.
- Always use gzip compression on firehose requests for clients that support it (instead of always answering with the same compression as the request).
- The `substreams-tier1` app now has two new configuration flags, `substreams-tier1-active-requests-soft-limit` and `substreams-tier1-active-requests-hard-limit`, helping better load balance active requests across a pool of `tier1` instances.

  The `substreams-tier1-active-requests-soft-limit` flag limits the number of active client requests a tier1 accepts before it starts reporting itself as 'unready' on the health check endpoint. A limit of 0 or less means no limit. This is useful to load balance active requests more easily across a pool of tier1 instances: when an instance reaches the soft limit, it becomes unready from the load balancer's standpoint. The load balancer in return removes it from the list of available instances, and new connections are routed to the remaining instances, spreading the load.

  The `substreams-tier1-active-requests-hard-limit` flag limits the number of active client requests a tier1 accepts before rejecting incoming gRPC requests with the 'Unavailable' code and setting itself as unready. A limit of 0 or less means no limit. This is useful to prevent the tier1 from being overwhelmed by too many requests; most clients auto-reconnect on the 'Unavailable' code, so they should end up on another tier1 instance, assuming you have proper auto-scaling of the number of instances available.
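  As an illustrative sketch of how the two limits fit together (the flag names come from this entry; the limit values and the `start.flags` YAML config layout are assumptions about your deployment, not prescribed defaults):

  ```yaml
  # Hypothetical tier1 config excerpt: report unready past 200 active
  # requests, reject new requests with 'Unavailable' past 250.
  start:
    flags:
      substreams-tier1-active-requests-soft-limit: 200
      substreams-tier1-active-requests-hard-limit: 250
  ```

  With values like these, a load balancer polling the health endpoint drains an instance once it is between the soft and hard limits, and the hard limit acts as a safety net if traffic arrives faster than the balancer reacts.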
- The `substreams-tier1` app now exposes a new Prometheus metric `substreams_tier1_rejected_request_counter` that tracks rejected requests. The counter is labelled by the returned gRPC/ConnectRPC code (`ok` and `canceled` are not considered rejected requests).
- The `substreams-tier2` app now exposes a new Prometheus metric `substreams_tier2_rejected_request_counter` that tracks rejected requests. The counter is labelled by the returned gRPC/ConnectRPC code (`ok` and `canceled` are not considered rejected requests).
- Properly accept and compress responses with `gzip` for browser HTTP clients using ConnectWeb with the `Accept-Encoding` header
- Allow setting subscription channel max capacity via the `SOURCE_CHAN_SIZE` env var (default: 100)
- Fix an issue preventing proper detection of gzip compression when multiple headers are set (ex: python grpc client)
- Add support for zstd compression on server
- Fix an issue preventing some tier2 requests on last-stage from correctly generating stores. This could lead to some missing "backfilling" jobs and slower time to first block on reconnection.
- Fix a thread leak on cursor resolution resulting in bad counter for active connections
> [!NOTE]
> This release will reject substreams connections from clients that don't support GZIP compression. Use `--substreams-tier1-enforce-compression=false` to keep previous behavior, then check the logs for incoming Substreams Blocks request logs with the value `compressed: false` to track users who are not using compressed HTTP connections.
- Substreams: add `--substreams-tier1-enforce-compression` to reject connections from clients that do not support GZIP compression
- Substreams performance: reduced the number of `mallocs` (patching some third-party libraries)
- Substreams performance: removed heavy tracing (that wasn't exposed to the client)
- Fixed `reader-node-line-buffer-size` flag that was not being respected in the `reader-node-stdin` app
- Well-known chains: change genesis block for near-mainnet from 9820214 to 9820210
- BlockPoller library: reworked logic to support more flexible balancing strategy
- `firehose-grpc-listen-addr` and `substreams-tier1-grpc-listen-addr` flags now accept comma-separated addresses (allows listening as plaintext and snakeoil-ssl at the same time or on specific IP addresses)
- Removed old `RegisterServiceExtension` implementation (not used anywhere anymore)
- rpc-poller lib: fix fetching the first block on an endpoint (was not following the cursor, failing unnecessarily on non-archive nodes)
- Bump `substreams` and `dmetering` to latest versions, adding the `outputModuleHash` to the metering sender.
> [!NOTE]
> All caches for stores using the `set_sum` update policy (added in substreams v1.7.0), and modules that depend on them, will need to be deleted, since they may contain bad data.
- Fix bad data in stores using the `set_sum` policy: squashing of store segments incorrectly "summed" some values that should have been "set" if the last event for a key on the segment was a "sum"
- Fix small bug making some requests in development mode slow to start (when starting close to the module's initialBlock with a store that doesn't start on a boundary)
- [Operator] The Node Manager HTTP `/v1/resume` call now accepts `extra-env=<key>=<value>&extra-env=<keyN>=<valueN>`, enabling overriding of environment variables for the next restart only. Use `curl -XPOST "http://localhost:10011/v1/resume?sync=true&extra-env=NODE_DEBUG=true"` (change `localhost:10011` according to your setup). This is not persistent upon restart!
- [Metering] Revert undesired Firehose metric `Endpoint` changes; the correct new value used is `sf.firehose.v2.Firehose/Blocks` (it had been mistakenly set to `sf.firehose.v2.Firehose/Block` between versions v1.6.1 and v1.6.4 inclusively).
- Fixed an(other) issue where multiple stores running on the same stage with different initialBlocks would fail to progress (and hang)
- Fix "cannot resolve 'old cursor' from files in passthrough mode" error on some requests with an old cursor
- Fix handling of 'special case' substreams modules with only "params" as input: should not skip this execution (used in graph-node for head tracking)
  - Empty files in the module cache with hash `d3b1920483180cbcd2fd10abcabbee431146f4c8` should be deleted for consistency
- Fix bug where some invalid cursors could be sent (with 'LIB' being above the block being sent) and add safeguard/logging if the bug appears again
- Fix panic in the whole tier2 process when stores go above the size limit while being read from "kvops" cached changes
- fix: reader-node-stdin not shutting down after receiving an EOF
- [Operator] The flag `--advertise-block-id-encoding` now accepts a shorter form: `hex`, `base64`, etc. The older longer form `BLOCK_ID_ENCODING_HEX` is still supported, but we suggest using the shorter form from now on.
> [!NOTE]
> Since a bug that affected substreams with "skipping blocks" was corrected in this release, any previously produced substreams cache should be considered possibly corrupted and eventually replaced.
- Substreams: fix bad handling of modules with multiple inputs when only one of them is filtered, resulting in bad outputs in production-mode.
- Substreams: fix stalling on some substreams with stores and mappers with different start block numbers on the same stage
- Substreams: fix 'development mode' and LIVE mode executing some modules that should be skipped
- Bump substreams to v1.10.0: Version 1.10.0 adds a new `EndpointInfo/Info` endpoint, introduces a 3-minute default execution timeout per block, updates metering metrics with a deprecation warning, enhances `substreams init` commands, and improves wasm module caching and Prometheus tool flexibility. Full changelog: https://github.com/streamingfast/substreams/releases/tag/v1.10.0
- Metering update: more detailed metering with the addition of new metrics. DEPRECATION WARNING: `bytes_read` and `bytes_written` metrics will be removed in the future, please use the new metrics for metering instead.
- Add `sf.firehose.v2.EndpointInfo/Info` service on Firehose and `sf.substreams.rpc.v2.EndpointInfo/Info` on Substreams endpoints. This involves the following new flags:
  - `advertise-chain-name`: Canonical name of the chain according to https://thegraph.com/docs/en/developing/supported-networks/ (required, unless it is in the "well-known" list)
  - `advertise-chain-aliases`: Alternate names for that chain (optional)
  - `advertise-block-features`: List of features describing the blocks (optional)
  - `advertise-block-id-encoding`: Encoding format of the block ID [BLOCK_ID_ENCODING_BASE58, BLOCK_ID_ENCODING_BASE64, BLOCK_ID_ENCODING_BASE64URL, BLOCK_ID_ENCODING_HEX, BLOCK_ID_ENCODING_0X_HEX] (required, unless the block type is in the "well-known" list)
  - `ignore-advertise-validation`: Runtime checks of chain name/features/encoding against the genesis block will no longer cause the server to wait or fail.
- Add a well-known list of chains (hard-coded in `wellknown/chains.go`) to help automatically determine the 'advertise' flag values. Users are encouraged to propose Pull Requests to add more chains to the list.
- The new info endpoint adds a mandatory fetching of the first streamable block on startup, with a failure if no block can be fetched after 3 minutes and you are running the `firehose` or `substreams-tier1` service. It validates the following on a well-known chain:
  - If the first-streamable-block Num/ID match the genesis block of a known chain, e.g. `matic`, it will refuse any value for `advertise-chain-name` other than `matic` or one of its aliases (`polygon`)
  - If the first-streamable-block does not match any known chain, it will require `advertise-chain-name` to be non-empty
  - If the first-streamable-block type is unknown (i.e. not ethereum, solana, near, cosmos, bitcoin...), it will require the user to provide `advertise-chain-name` as well as `advertise-block-id-encoding`
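  The advertise flags could be set like this in a config file (a sketch only: the chain values are illustrative examples, and the `start.flags` YAML layout is an assumption about your setup):

  ```yaml
  # Hypothetical excerpt for a Polygon endpoint; for a well-known chain
  # these flags can be determined automatically and omitted.
  start:
    flags:
      advertise-chain-name: matic
      advertise-chain-aliases: polygon
      advertise-block-id-encoding: BLOCK_ID_ENCODING_HEX
  ```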
- Substreams: add `--common-tmp-dir` flag and activate local caching of pre-compiled WASM modules through the wazero feature
- Substreams: revert module hash calculation from `v1.5.5` when using a non-zero firstStreamableBlock. Hashes will now be the same even if the chain's first streamable block affects the initialBlock of a module.
- Substreams: add `--substreams-block-execution-timeout` flag (default 3 minutes) to prevent requests stalling
- Metering update: more detailed metering with the addition of new metrics (`live_uncompressed_read_bytes`, `live_uncompressed_read_forked_bytes`, `file_uncompressed_read_bytes`, `file_uncompressed_read_forked_bytes`, `file_compressed_read_forked_bytes`, `file_compressed_read_bytes`). DEPRECATION WARNING: `bytes_read` and `bytes_written` metrics will be removed in the future, please use the new metrics for metering instead.
- Bump substreams to v1.9.3: fix high CPU usage on tier1 caused by a bad error handling
- Bump substreams to v1.9.2: Prevent Noop handler from sending outputs with 'Stalled' step in cursor (which breaks substreams-sink-kv)
- add `--reader-node-line-buffer-size` flag and bump default value from 100M to 200M to go over crazy block 278208000 on Solana
- added well-known types for starknet and cosmos
- Fixed a bug in substreams where chains with non-zero first-streamable-block would cause some substreams to hang. Solution changes the 'cached' hashes for those substreams.
- Fix a bug introduced in v1.6.0 that could result in a corrupted store "state" file if all the "outputs" were already cached for a module in a given segment (rare occurrence)
- We recommend clearing your substreams cache after this upgrade and re-processing or validating your data if you use stores.
- Expose a new intrinsic to modules: `skip_empty_output`, which causes the module output to be skipped if it has zero bytes. (Watch out: a protobuf object with all its default values will have zero bytes.)
- Improve scheduling order (faster time to first block) for substreams with multiple stages when starting mid-chain
- fix "hub" not recovering on certain disconnections in relayer/firehose/substreams (scenarios requiring full restart)
- Added substreams back-filler to populate cache for live requests when the blocks become final
- Fixed: truncate very long details on error messages to prevent them from disappearing when behind a (misbehaving) load-balancer
- Bootstrapping from live blocks improved for chains with very slow blocks or with very fast blocks (affects relayer, firehose and substreams tier1)
- Substreams fixed slow response close to HEAD in production-mode
- Substreams engine is now able to run Rust code that depends on `solana_program` in Solana land (to decode) and `alloy`/`ethers-rs` in Ethereum land.

  When used in a `wasm32-unknown-unknown` context, those libraries create a bunch of wasm-bindgen imports in the resulting Substreams Rust code, imports that led to runtime errors because the Substreams engine didn't know about those special imports until today.

  The Substreams engine is now able to "shim" those wasm-bindgen imports, enabling you to run code that depends on libraries like `solana_program` and `alloy`/`ethers-rs`, which are known to pull in those wasm-bindgen imports. This works as long as you do not actually call those special imports; normal usage of those libraries doesn't call those methods. If they are called, the WASM module will fail at runtime and stall the Substreams module from going forward.

  To enable this feature, you need to explicitly opt in by appending `+wasm-bindgen-shims` to the binary's type in your Substreams manifest:

  ```yaml
  binaries:
    default:
      type: wasm/rust-v1
      file: <some_file>
  ```

  becomes

  ```yaml
  binaries:
    default:
      type: wasm/rust-v1+wasm-bindgen-shims
      file: <some_file>
  ```

- Substreams clients now enable gzip compression over the network (already supported by servers).
- Substreams binary type can now optionally be composed of runtime extensions by appending `+<extension>[,<extensions...>]` at the end of the binary type. Extensions are `key[=value]` pairs that are runtime specific.

  > [!NOTE]
  > If you are a library author parsing generic Substreams manifest(s), you will now need to handle that possibility in the binary type. If you were reading the field without any processing, you don't have to change anything.
- Fix parsing of flag 'common-index-block-sizes' from yaml config file
- execout: preload only one file instead of two, log if undeleted caches found
- execout: add environment variable SUBSTREAMS_DISABLE_PRELOAD_EXEC_FILES to disable file preloading
- Revert sanity check to support the special case of a substreams with only 'params' as input. This allows a chain-agnostic event to be sent, along with the clock.
- Fix error handling when resolved start-block == stop-block and stop-block is defined as non-zero
> [!NOTE]
> Upgrading will require changing the tier1 and tier2 versions concurrently, as the internal protocol has changed.
- Index Modules and Block Filter now supported. See https://github.com/streamingfast/substreams-foundational-modules for an example implementation
- Various scheduling and performance improvements
- env variable `SUBSTREAMS_WORKERS_RAMPUP_TIME` changed from `4s` to `0`. Set it to `4s` to keep previous behavior
- `otelcol://` tracing protocol no longer supported
- Allow stores to write to stores with out-of-order ordinals (they will be reordered at the end of the module execution for each block)
- Fix issue in substreams-tier2 causing some files to be written to the wrong place sometimes under load, resulting in some hanging requests
- The `tools download-from-firehose` command now respects its documentation when doing `--help`; the correct invocation is now `firecore tools download-from-firehose <endpoint> <start>:<end> <output_folder>`.
- The `firecore tools download-from-firehose` command has been improved to work with the new Firehose `sf.firehose.v2.BlockMetadata` field: if the server sends this new field, the tool will work on any chain. If the server you are reaching is not recent enough, the tool falls back to the previous logic. All StreamingFast endpoints should be compatible.
- Firehose responses (both single block and stream) now include the `sf.firehose.v2.BlockMetadata` field. This new field contains the chain-agnostic fields we hold about any block of any chain.
- Fixed possible race condition in the blockPoller
- Fix relayer waiting too long to fail when reconnecting to a single source (especially on slow chains). It will now fail right away if it receives an unlinkable block and has a single source configured.
- Fixed skipped block handling and performance issues on blockPoller
- The `--block-type` flag got renamed to `--substreams-tier1-block-type`. Specifying it will make substreams-tier1 skip the block type discovery (from files or live stream) on startup, getting ready faster.
- Logs now print the "x-deployment-id" header on firehose connections (used to propagate subgraph deployment ids from graph-node and help debugging)
- bump substreams to v1.5.5 with fix in wazero to prevent process freezing on certain substreams
- bump go-generics to v3.4.0
- fix a possible panic() when a request is interrupted during the file loading phase of a squashing operation.
- fix a rare possibility of stalling if only some fullkv stores caches were deleted, but further segments were still present.
- fix stats counters for store operations time
- add `DefaultBlockType` into the `firehose.Chain` struct, enabling default block type setting for known chains
- bumped to v1.5.3
- add `--block-type` flag that can be specified when creating substreams tier1. If not specified, tier1 will auto-detect the block type from the source.
- fix memory leak on substreams execution (by bumping wazero dependency)
- prevent substreams-tier1 stopping if blocktype auto-detection times out
- fix missing error handling when writing output data to files. This could result in tier1 request just "hanging" waiting for the file never produced by tier2.
- fix handling of dstore error in tier1 'execout walker' causing stalling issues on S3 or on unexpected storage errors
- increase number of retries on storage when writing states or execouts (5 -> 10)
- prevent slow squashing when loading each segment from full KV store (can happen when a stage contains multiple stores)
- Fix a context leak causing tier1 responses to slow down progressively
- fix another panic on substreams-tier2 service
- fix thread leak in metering affecting substreams
- revert a substreams scheduler optimisation that causes slow restarts when close to head
- add substreams_tier2_active_requests and substreams_tier2_request_counter prometheus metrics
- fix panic on substreams-tier2 service
- Substreams bumped to @v1.5.0: See https://github.com/streamingfast/substreams/releases/tag/v1.5.0 for details.
- A single substreams-tier2 instance can now serve requests for multiple chains or networks. All network-specific parameters are now passed from Tier1 to Tier2 in the internal ProcessRange request.
- This allows you to better use your computing resources by pooling all the networks together.
> [!IMPORTANT]
> Since the tier2 services will now get the network information from the tier1 request, you must make sure that the file paths and network addresses will be the same for both tiers.
> For example, if `--common-merged-blocks-store-url=/data/merged` is set on tier1, make sure the merged blocks are also available from tier2 under the path `/data/merged`.
> The flags `--substreams-state-store-url`, `--substreams-state-store-default-tag` and `--common-merged-blocks-store-url` are now ignored on tier2. The flag `--common-first-streamable-block` should be set to 0 to accommodate every chain.
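A sketch of the constraint described above (the path and the `start.flags` YAML layout are hypothetical; the point is only that both tiers must resolve the same URL to the same data):

```yaml
# tier1 config excerpt
start:
  flags:
    common-merged-blocks-store-url: /data/merged
    common-first-streamable-block: 0

# On tier2, /data/merged must serve the very same merged blocks
# (shared volume, network mount, or identical object store), even
# though tier2 now ignores its own store flags.
```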
> [!TIP]
> The cached 'partial' files no longer contain the "trace ID" in their filename, preventing accumulation of "unsquashed" partial store files. The system will delete files under `{modulehash}/state` named in the format `{blocknumber}-{blocknumber}.{hexadecimal}.partial.zst` when it runs into them.
- All module outputs are now cached. (previously, only the last module was cached, along with the "store snapshots", to allow parallel processing).
- Tier2 will now read back mapper outputs (if they exist) to prevent running them again. Additionally, it will not read back the full blocks if its inputs can be satisfied from existing cached mapper outputs.
- Tier2 will skip processing completely if it's processing the last stage and the `output_module` is a mapper that has already been processed (e.g. when multiple requests are indexing the same data at the same time)
- Tier2 will skip processing completely if it's processing a stage where all the stores and outputs have been processed and cached.
- Scheduler modification: a stage now waits for the previous stage to have completed the same segment before running, to take advantage of the cached intermediate layers.
- Improved file listing performance for Google Storage backends by 25%
> [!TIP]
> Concurrent requests on the same module hashes may benefit from the other requests' work to a certain extent (up to 75%) -- the very first request does most of the work for the other ones.

> [!TIP]
> More caches will increase disk usage and there is no automatic removal of old module caches. The operator is responsible for deleting old module caches.
- Readiness metric for Substreams tier1 app is now named `substreams_tier1` (was mistakenly called `firehose` before).
- Added back readiness metric for Substreams tier2 app (named `substreams_tier2`).
- Added metric `substreams_tier1_active_worker_requests`, which gives the number of active Substreams worker requests a tier1 app is currently doing against tier2 nodes.
- Added metric `substreams_tier1_worker_request_counter`, which gives the total Substreams worker requests a tier1 app made against tier2 nodes.
- Added `--merger-delete-threads` to customize the number of threads the merger will use to delete files. It's recommended to increase this to 25 or higher when using Ceph as the S3 storage provider (due to performance issues with deletes, the merger might otherwise not be able to delete one-block files fast enough).
- Added `--substreams-tier2-max-concurrent-requests` to limit the number of concurrent requests to the tier2 substreams service.
- If the relayer is started with a single source, it will have reduced tolerance for missing blocks. This is to prevent the relayer from falling behind when the source is not producing blocks.
- Fixed `tools check merged-blocks` default range when `-r <range>` is not provided to now be `[0, +∞]` (was previously `[HEAD, +∞]`).
- Fixed `tools check merged-blocks` to be able to run without a block range provided.
- Added API Key based authentication to `tools firehose-client` and `tools firehose-single-block-client`; specify the value through environment variable `FIREHOSE_API_KEY` (you can use flag `--api-key-env-var` to change the variable's name to something else than `FIREHOSE_API_KEY`).
- Fixed `tools check merged-blocks` examples using block range (range should be specified as `[<start>]?:[<end>]`).
- Added `--substreams-tier2-max-concurrent-requests` to limit the number of concurrent requests to the tier2 Substreams service.
- Added API Key authentication to `client.NewFirehoseFetchClient` and `client.NewFirehoseClient`.

  > [!NOTE]
  > If you were using `github.com/streamingfast/firehose-core/firehose/client.NewFirehoseFetchClient` or `github.com/streamingfast/firehose-core/firehose/client.NewFirehoseStreamClient`, this will be a minor breaking change; refer to the upgrade notes for details if it affects you.
- Performance: prevent reprocessing jobs when there is only a mapper in production mode and everything is already cached
- Performance: prevent "UpdateStats" from running too often and stalling other operations when running with a high parallel jobs count
- Performance: fixed bug in scheduler ramp-up function sometimes waiting before raising the number of workers
- Added the output module's hash to the "incoming request" log
- The `reader-node-bootstrap-url` gained the ability to be bootstrapped from a `bash` script.

  If the bootstrap URL is of the form `bash:///<path/to/script>?<parameters>`, the bash script at `<path/to/script>` will be executed. The script receives the resolved reader node variables as environment variables in the form `READER_NODE_<VARIABLE_NAME>`. The fully resolved node arguments (from `reader-node-arguments`) are passed as args to the bash script. The accepted query parameters are:

  - `arg=<value>` | Passed as an extra argument to the script, prepended to the list of resolved node arguments
  - `env=<key>%3d<value>` | Passed as an extra environment variable `<key>=<value>` with key being upper-cased (multiple(s) allowed)
  - `env_<key>=<value>` | Passed as an extra environment variable `<key>=<value>` with key being upper-cased (multiple(s) allowed)
  - `cwd=<path>` | Change the working directory to `<path>` before running the script
  - `interpreter=<path>` | Use `<path>` as the interpreter to run the script
  - `interpreter_arg=<arg>` | Pass `<interpreter_arg>` as arguments to the interpreter before the script path (multiple(s) allowed)

  > [!NOTE]
  > The `bash:///` script support is currently experimental and might change in upcoming releases; behavior changes will be clearly documented here.
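  A sketch of what such a bootstrap URL could look like (the script path, argument, and environment variable are hypothetical; `%3d` is the URL-encoded `=` required inside the `env` parameter, and the `start.flags` YAML layout is assumed):

  ```yaml
  # Runs /opt/bootstrap.sh before node startup, prepending --from-snapshot
  # to the node arguments and exporting NETWORK=mainnet to the script.
  start:
    flags:
      reader-node-bootstrap-url: "bash:///opt/bootstrap.sh?arg=--from-snapshot&env=network%3dmainnet"
  ```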
The
reader-node-bootstrap-urlgained the ability to be bootstrapped from a pre-made archive file ending withtar.zstortar.zstd. -
The
reader-node-bootstrap-data-urlis now added automatically iffirecore.Chain#ReaderNodeBootstrapperFactoryisnon-nil.If the bootstrap URL ends with
tar.zstortar.zstd, the archive is read and extracted into thereader-node-data-dirlocation. The archive is expected to contain the full content of the 'reader-node-data-dir' and is expanded as is. -
Added
Beaconto known list of Block model.
- Fix marshalling of blocks to JSON in tools like `firehose-client` and `print merged-blocks`
- Add missing metering events for `sf.firehose.v2.Fetch/Block` responses.
- Changed default polling interval in 'continuous authentication' from 10s to 60s, added 'interval' query param to URL.
- Fixed bug in scheduler ramp-up function sometimes waiting before raising the number of workers
- Fixed load-balancing from tier1 to tier2 when using dns:/// (round-robin policy was not set correctly)
- Added `trace_id` in grpc authentication calls
- Bumped connect-go library to new "connectrpc.com/connect" location
- Fixed `tools firehose-client` which was broken because of bad flag handling
- Added `--api-key-env-var` flag to firehose clients, which allows you to pass your API Key from an environment variable (HTTP header `x-api-key`) instead of a JWT (`Authorization: bearer`), where supported.
- Poller now fetches blocks in an optimized way: it fetches several blocks at once and then processes them.
- Poller now handles skipped blocks: it fetches the next blocks until it finds a non-skipped block.
- Poller now has a default retry value of infinite.
- Compare tool now uses a dynamic protobuf unmarshaler and is able to compare any block type.
- Print tool now uses a dynamic protobuf unmarshaler and is able to print any block type.
- Print tool encodes bytes in base64 by default; this can be changed to hex or base58 using the `bytes-encoding` parameter.
- Added 'x-trace-id' header to auth requests when using `--common-auth-plugin=grpc`
- Fixed Substreams scheduler sometimes taking a long time to spawn more than a single worker.
- Added `ACCEPT_SOLANA_LEGACY_BLOCK_FORMAT` env var to allow special tweak operations
- Removed useless chainLatestFinalizeBlock from blockPoller initialization
- Added `Arweave` to the known list of Block models.
- Added `FORCE_FINALITY_AFTER_BLOCKS` environment variable to override block finality information at the reader/poller level. This allows an operator to pretend that finality is still progressing, N blocks behind HEAD, in the case where a beacon chain fails to do so; it is intended as a workaround for deprecated chains like Goerli.
- Updated `substreams` and `dgrpc` to latest versions to reduce logging.
- Tools printing the Firehose `Block` model to JSON now have `--proto-paths` take higher precedence over well-known types and even the chain itself; the order is `--proto-paths` > `chain` > `well-known` (so `well-known` is looked up last).
- The `tools print one-block` command now works correctly on blocks generated by the omni-chain `firecore` binary.
The various health endpoint now sets
Content-Type: application/jsonheader prior sending back their response to the client. -
The
firehose,substreams-tier1andsubstream-tier2health endpoint now respects thecommon-system-shutdown-signal-delayconfiguration value meaning that the health endpoint will returnfalsenow ifSIGINThas been received but we are still in the shutdown unready period defined by the config value. If you use some sort of load balancer, you should make sure they are configured to use the health endpoint and you shouldcommon-system-shutdown-signal-delayto something like15s. -
The
firecore.ConsoleReadergained the ability to print stats as it ingest blocks. -
The
firecore.ConsoleReaderhas been made stricter by ensuring Firehose chain exchange protocol is respected. -
Changed
readerlogger back toreader-nodeto fit with the app's name which isreader-node. -
Fix
-c ""not working properly when no arguments are present when invokingstartcommand. -
Fix
tools compare-blocksthat would fail on new format. -
Fix
substreamsto correctly delete.partialfiles when serving a request that is not on a boundary. -
Add Antelope types to the blockchain's known types.
This is a major release.
> [!IMPORTANT]
> When upgrading your stack to firehose-core v1.0.0, be sure to upgrade all components simultaneously because the block encapsulation format has changed. Blocks that are merged using the new merger will not be readable by previous versions.
- New binary `firecore` which can run all firehose components (`reader`, `reader-stdin`, `merger`, `relayer`, `firehose`, `substreams-tier1|2`) in a chain-agnostic way. This is not mandatory (it can still be used as a library) but strongly suggested when possible.
- Current limitations on Ethereum:
  - The firecore `firehose` app does not support transforms (filters, header-only --for graph-node compatibility--) so you will want to continue running this app from `fireeth`
  - The firecore `substreams` apps do not support eth_calls so you will want to continue running them from `fireeth`
  - The firecore `reader` does not support the block format output by the current geth firehose instrumentation, so you will want to continue running it from `fireeth`
- New BlockPoller library to facilitate the implementation of rpc-poller-based chains, taking care of managing reorgs
- Considering that firehose-core is chain-agnostic, it's not aware of the different block types. To be able to use tools around block decoding/printing, there are two ways to provide the type definition:
  - The 'protoregistry' package contains well-known block type definitions (ethereum, near, solana, bitcoin...); you won't need to provide anything in those cases.
  - For other types, you can provide additional protobuf files using the `--proto-path` flag
- Merged blocks storage format has been changed. Current blocks will continue to be decoded, but new merged blocks will not be readable by previous software versions.
- The code from the following repositories has been merged into this repo. They will soon be archived.
- github.com/streamingfast/node-manager
- github.com/streamingfast/merger
- github.com/streamingfast/relayer
- github.com/streamingfast/firehose
- github.com/streamingfast/index-builder
- Fixed SF_TRACING feature (regression broke the ability to specify a tracing endpoint)
- Firehose connections rate-limiting will now force a delay of between 1 and 4 seconds (random value) before refusing a connection when under heavy load
- Fixed substreams GRPC/Connect error codes not propagating correctly
- fixed typo in `check-merged-blocks` preventing its proper display of missing ranges
- Firehose logs now include auth information (userID, keyID, realIP) along with blocks + egress bytes sent.
- Filesource validation of block order in merged-blocks now works correctly when using indexes in firehose `Blocks` queries
- Flag `substreams-rpc-endpoints` removed; this was present by mistake and actually unused.
- Flag `substreams-rpc-cache-store-url` removed; this was present by mistake and actually unused.
- Flag `substreams-rpc-cache-chunk-size` removed; this was present by mistake and actually unused.
> [!IMPORTANT]
> We have had reports of older versions of this software creating corrupted merged-blocks files (with duplicate or extra out-of-bound blocks). This release adds additional validation of merged-blocks to prevent serving duplicate blocks from the firehose or substreams service. This may cause a service outage if you have produced those blocks or downloaded them from another party who was affected by this bug.

- Find the affected files by running the following command (can be run multiple times in parallel, over smaller ranges):

  ```
  tools check merged-blocks-batch <merged-blocks-store> <start> <stop>
  ```

- If you see any affected range, produce fixed merged-blocks files with the following command, on each range:

  ```
  tools fix-bloated-merged-blocks <merged-blocks-store> <output-store> <start>:<stop>
  ```

- Copy the merged-blocks files created in the output-store over to your merged-blocks-store, replacing the corrupted files.
- Removed the
--dedupe-blocksflag ontools download-from-firehoseas it can create confusion and more issues.
- Bumped `bstream`: the `filesource` will now refuse to read blocks from a merged-blocks file if they are not ordered or if there are any duplicates.
- The command `tools download-from-firehose` will now fail if it is being served blocks "out of order", to prevent any corrupted merged-blocks from being created.
- The command `tools print merged-blocks` did not print the whole merged-blocks file and its arguments were confusing: it will now parse `<start_block>` as a uint64.
- The command `tools unmerge-blocks` did not cover the whole given range; this is now fixed.
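As a hedged illustration of the new bundle validation (a minimal sketch under assumed semantics, not the actual `bstream` `filesource` code): block numbers in a merged-blocks bundle must fall inside the bundle's range, with no duplicates and no out-of-order entries.

```go
package main

import "fmt"

// checkBundle sketches the validation described above: blocks in a
// merged-blocks bundle of size bundleSize starting at baseNum must be
// inside [baseNum, baseNum+bundleSize), strictly increasing, and unique.
func checkBundle(baseNum, bundleSize uint64, blockNums []uint64) error {
	var last uint64
	for i, num := range blockNums {
		if num < baseNum || num >= baseNum+bundleSize {
			return fmt.Errorf("block #%d outside of bundle range [%d, %d)", num, baseNum, baseNum+bundleSize)
		}
		if i > 0 && num == last {
			return fmt.Errorf("duplicate block #%d", num)
		}
		if i > 0 && num < last {
			return fmt.Errorf("out-of-order block #%d after #%d", num, last)
		}
		last = num
	}
	return nil
}

func main() {
	fmt.Println(checkBundle(100, 100, []uint64{100, 101, 102})) // valid bundle, no error
	fmt.Println(checkBundle(100, 100, []uint64{100, 100, 101})) // duplicate detected
}
```

A bundle failing any of these checks is what `tools fix-bloated-merged-blocks` is meant to repair.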
- Added the command `tools fix-bloated-merged-blocks` to try to fix merged-blocks that contain duplicates and blocks outside of their range.
- The commands `tools print one-block` and `tools print merged-blocks` now support a new `--output-format jsonl` format. Bytes data can now be printed as hex or base58 strings instead of base64 strings.
- Changed `tools check merged-blocks-batch` argument syntax: the output-to-store is now optional.
- Fixed a few false positives on `tools check merged-blocks-batch`.
- Fixed `tools print merged-blocks` to correctly print a single block if specified.
- **Breaking** The `reader-node-log-to-zap` flag has been removed. This was a source of confusion for operators reporting Firehose bugs, because the node's logs were merged within normal Firehose logs and it was not obvious. Now, logs from the node are printed to `stdout` unformatted, exactly as presented by the chain. Filtering of such logs must now be delegated to the node's implementation and depends on the node's binary; refer to it to determine how you can tweak the logging verbosity emitted by the node.
- Added support for `-o jsonl` in `tools print merged-blocks` and `tools print one-block`.
- Added support for a block range in `tools print merged-blocks`.

  > [!NOTE]
  > For now, the range is restricted to a single merged-blocks file!
- Added a retry loop for the merger when walking one-block files, for use cases where the bundle reader was sending files too fast and the merger was not waiting to accumulate enough files to start bundling merged files.
- Added `--dedupe-blocks` flag on `tools download-from-firehose` to ensure no duplicate blocks end up in downloaded merged-blocks (should not be needed in normal operations).
- Added `tools check merged-blocks-batch` to simplify checking block continuity in batched mode, writing results to a store.
- Bumped substreams to `v1.1.20` with fixes for some minor bugs related to start block processing.
- Bumped `substreams` to `v1.1.18` with a regression fix for when a Substreams has a start block in the reversible segment.
- The `--common-auth-plugin` got back the ability to use `secret://<expected_secret>?[user_id=<user_id>]&[api_key_id=<api_key_id>]`, in which case requests are authenticated based on the `Authorization: Bearer <actual_secret>` header and continue only if `<actual_secret> == <expected_secret>`.
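A minimal sketch of how such a check could work (the `authenticate` helper and its signature are hypothetical illustrations, not the actual auth plugin code):

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// authenticate sketches the secret:// behavior described above: the
// request's "Authorization: Bearer <actual_secret>" value must equal the
// <expected_secret> encoded in the plugin DSN; the optional user_id query
// parameter is attached to the authenticated request when present.
func authenticate(dsn, authorizationHeader string) (userID string, err error) {
	u, err := url.Parse(dsn)
	if err != nil || u.Scheme != "secret" {
		return "", fmt.Errorf("invalid secret:// DSN")
	}
	expectedSecret := u.Host
	actualSecret := strings.TrimPrefix(authorizationHeader, "Bearer ")
	// TrimPrefix returns the input unchanged when the prefix is missing,
	// which lets us reject headers without the "Bearer " prefix.
	if actualSecret == authorizationHeader || actualSecret != expectedSecret {
		return "", fmt.Errorf("invalid or missing bearer token")
	}
	return u.Query().Get("user_id"), nil
}

func main() {
	user, err := authenticate("secret://s3cret?user_id=dev1", "Bearer s3cret")
	fmt.Println(user, err)
}
```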
- Bumped `substreams` to `v1.1.17`, which provides new metrics `substreams_active_requests` and `substreams_counter`.
- Bumped `substreams` to `v1.1.14` to fix bugs with start blocks, where Substreams would fail if the start block was before the first block of the chain, or if the start block was a block not yet produced by the chain.
- Improved error message when the referenced config file is not found; removed hard-coded mention of `fireacme`.
- More tolerant retry/timeouts on filesource (prevent "Context Deadline Exceeded")
> [!IMPORTANT]
> The Substreams service exposed from this version will send progress messages that cannot be decoded by Substreams clients prior to `v1.1.12`. Streaming of the actual data will not be affected. Clients will need to be upgraded to properly decode the new progress messages.
- Bumped substreams to `v1.1.12` to support the new progress message format. Progression now relates to stages instead of modules. You can get stage information using the `substreams info` command starting at version `v1.1.12`.
- Bumped supervisor buffer size to 100MB.
- Substreams bumped: better "Progress" messages
- Added new templating options to `reader-node-arguments`, specifically `{start-block-num}` (maps to configuration value `reader-node-start-block-num`) and `{stop-block-num}` (maps to configuration value `reader-node-stop-block-num`).
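For example (a hypothetical config sketch: the flag values and the node arguments themselves depend on your chain's binary; only the `{start-block-num}` and `{stop-block-num}` template variables come from this release):

```yaml
start:
  flags:
    reader-node-start-block-num: 1000
    reader-node-stop-block-num: 2000
    # {start-block-num} and {stop-block-num} below are replaced
    # by the two configuration values above
    reader-node-arguments: "--start {start-block-num} --stop {stop-block-num}"
```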
- The `reader-node` is now able to read Firehose node protocol lines up to 100 MiB in raw size (previously the limit was 50 MiB).
- Removed `--substreams-tier1-request-stats` and `--substreams-tier2-request-stats` (Substreams request stats are now always sent to clients).
- Fixed bug where the `null` dmetering plugin was not able to be registered.
- Fixed dmetering bug where events were dropped when the channel got saturated.
- `fire{chain} tools check forks` now sorts forks by block number in ascending order (so the last line you see is the current highest fork).
- `fire{chain} tools check forks --after-block` can now be used to show only forks after a certain block number.
- Bumped `firehose`, `dmetering` and `bstream` dependencies in order to get the latest fixes to meter live blocks.
This release bumps Substreams to v1.1.10.
- Fixed jobs that would hang when flags `--substreams-state-bundle-size` and `--substreams-tier1-subrequests-size` had different values. The latter flag has been completely removed; subrequests will be bound to the state bundle size.
- Added support for continuous authentication via the grpc auth plugin (allowing cutoff triggered by the auth system).
This release bumps Substreams to v1.1.9.
The substreams scheduler has been improved to reduce the number of required jobs for parallel processing. This affects backprocessing (preparing the states of modules up to a "start-block") and forward processing (preparing the states and the outputs to speed up streaming in production-mode).
Jobs on tier2 workers are now divided in "stages", each stage generating the partial states for all the modules that have the same dependencies. A Substreams that has a single store won't be affected, but one that has 3 top-level stores, which used to run 3 jobs for every segment, now only runs a single job per segment to get all the states ready.
The substreams server now accepts the `X-Sf-Substreams-Cache-Tag` header to select which Substreams state store URL should be used by the request. When performing a Substreams request, the servers will optionally pick the state store based on the header. This enables consumers to stay on the same cache version when the operators need to bump the data version (a reason for this could be a bug in the Substreams software that caused some cached data to be corrupted or invalid).
To benefit from this, operators that currently have a version in their state store URL should move the version part from `--substreams-state-store-url` to the new flag `--substreams-state-store-default-tag`. For example, if today you have in your config:
```yaml
start:
  ...
  flags:
    substreams-state-store-url: /<some>/<path>/v3
```

You should convert to:

```yaml
start:
  ...
  flags:
    substreams-state-store-url: /<some>/<path>
    substreams-state-store-default-tag: v3
```

The apps `substreams-tier1` and `substreams-tier2` should be upgraded concurrently. Some calls will fail while versions are misaligned.
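A minimal sketch of the resulting store selection (the function name and signature are hypothetical, not the actual server code): the base URL comes from `--substreams-state-store-url` and the tag comes from the `X-Sf-Substreams-Cache-Tag` header when present, falling back to `--substreams-state-store-default-tag`.

```go
package main

import (
	"fmt"
	"strings"
)

// pickStateStore sketches the header-based selection described above:
// join the base state store URL with the request's cache tag, or with
// the default tag when the header is absent.
func pickStateStore(baseURL, defaultTag, headerTag string) string {
	tag := defaultTag
	if headerTag != "" {
		tag = headerTag
	}
	return strings.TrimRight(baseURL, "/") + "/" + tag
}

func main() {
	fmt.Println(pickStateStore("/data/states", "v3", ""))   // default tag used
	fmt.Println(pickStateStore("/data/states", "v3", "v4")) // header overrides default
}
```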
- Authentication plugin `trust` can now specify an exclusive list of `allowed` headers (all lowercase), ex: `trust://?allowed=x-sf-user-id,x-sf-api-key-id,x-real-ip,x-sf-substreams-cache-tag`.
- The `tier2` app no longer uses the `common-auth-plugin`; `trust` will always be used, so that `tier1` can pass down its headers (ex: `X-Sf-Substreams-Cache-Tag`).
- Added `fire{chain} tools check forks <forked-blocks-store-url> [--min-depth=<depth>]` that reads forked blocks you have and prints the resolved longest forks you have seen. The command works for any chain; here is a sample output:

  ```
  ...

  Fork Depth 3
  #45236230 [ea33194e0a9bb1d8 <= 164aa1b9c8a02af0 (on chain)]
  #45236231 [f7d2dc3fbdd0699c <= ea33194e0a9bb1d8]
  #45236232 [ed588cca9b1db391 <= f7d2dc3fbdd0699c]

  Fork Depth 2
  #45236023 [b6b1c68c30b61166 <= 60083a796a079409 (on chain)]
  #45236024 [6d64aec1aece4a43 <= b6b1c68c30b61166]

  ...
  ```

- The `fire{chain} tools` commands and sub-commands now render `--help` better by hiding unneeded global flags and providing longer descriptions.
- Added missing `--substreams-tier2-request-stats` request debugging flag.
- Added missing Firehose rate limiting option flags, `--firehose-rate-limit-bucket-size` and `--firehose-rate-limit-bucket-fill-rate`, to manage concurrent connection attempts to Firehose; check `fire{chain} start --help` for details.
- Fixed Substreams accepted block which was not working properly.