RPC optimization - getLogs

Next Milestone Date
February 14, 2022
Assign
Jenya Piskunov, Jack Chan
Status
Live on Devnet
Github
Collaborator

Grant-1

Grant-2

Grant-3

Release Date

2022 Q2

Goals

The goal of this project is to build a separate endpoint to serve heavy RPC calls such as eth_getLogs.

Context

The eth_getLogs call is very heavy: it may scan every block from genesis to the latest to find matching transaction logs. The current getLogs RPC handler in Harmony was not optimized to serve thousands of requests, as it was only designed to handle a single user's request.
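To make the cost concrete, the core of what a node does for every candidate log during such a scan is a filter match against the requested addresses and topics. The Go sketch below is purely illustrative (the types and names are assumptions, not Harmony's actual implementation); the point is that without an index, this check has to run for every log in every block of the requested range.

```go
package main

import "fmt"

// Log is a simplified stand-in for an EVM event log (hypothetical shape).
type Log struct {
	Address string
	Topics  []string
}

// matchLog reports whether a log satisfies a getLogs filter: the address
// must be in the address set (an empty set matches any address), and topic
// position i must be in the i-th topic set (an empty set is a wildcard).
func matchLog(log Log, addresses []string, topics [][]string) bool {
	if len(addresses) > 0 {
		found := false
		for _, a := range addresses {
			if a == log.Address {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	for i, want := range topics {
		if len(want) == 0 {
			continue // wildcard position
		}
		if i >= len(log.Topics) {
			return false // filter asks for a topic the log doesn't have
		}
		ok := false
		for _, t := range want {
			if t == log.Topics[i] {
				ok = true
				break
			}
		}
		if !ok {
			return false
		}
	}
	return true
}

func main() {
	l := Log{Address: "0xaaa", Topics: []string{"0xt0", "0xt1"}}
	fmt.Println(matchLog(l, []string{"0xaaa"}, [][]string{{"0xt0"}})) // true
	fmt.Println(matchLog(l, []string{"0xbbb"}, nil))                  // false
}
```

An indexed DB replaces this per-log loop with an index lookup, which is why the approach below scales.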

Based on Infura's shared experience, getLogs RPC requests could be served better from a separate indexed DB, such as Postgres.

Architecture

At the public endpoint, the payload needs to be parsed to identify eth_getLogs calls and redirect them to a dedicated load balancer. The logLB uses a reader/writer architecture to serve requests at scale.
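The payload-parsing step can be sketched as a small routing function: decode only the JSON-RPC `method` field and pick a backend pool. This is a hedged illustration in Go; the pool names (`logLB`, `standard`) are placeholders, and batch (array) payloads deliberately fall through to the default pool.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcCall decodes just enough of a JSON-RPC payload to read the method.
type rpcCall struct {
	Method string `json:"method"`
}

// routeFor returns the backend pool for a raw JSON-RPC body:
// eth_getLogs goes to the dedicated log load balancer, everything else
// (including unparsable or batch payloads) to the standard RPC pool.
func routeFor(body []byte) string {
	var call rpcCall
	if err := json.Unmarshal(body, &call); err != nil {
		return "standard" // batch arrays and bad payloads use the default pool
	}
	if call.Method == "eth_getLogs" {
		return "logLB"
	}
	return "standard"
}

func main() {
	fmt.Println(routeFor([]byte(`{"jsonrpc":"2.0","method":"eth_getLogs","id":1,"params":[{}]}`))) // logLB
	fmt.Println(routeFor([]byte(`{"jsonrpc":"2.0","method":"eth_blockNumber","id":2}`)))           // standard
}
```

In production this logic would live in the API gateway's Lambda (or an equivalent edge parser), and should stay as cheap as possible since it runs on every request.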

An additional indexer has to be built to index receipts and write them into the indexed Postgres DB. This is the writer side of the DB.
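The writer could look roughly like the following Go sketch, which flattens a receipt's logs into rows for an assumed `logs(block_number, tx_hash, log_index, address, topic0, data)` table; the actual schema and Harmony's receipt types may differ.

```go
package main

import "fmt"

// Hypothetical receipt/log shapes; Harmony's real types differ.
type Log struct {
	Address string
	Topics  []string
	Data    string
}

type Receipt struct {
	BlockNumber uint64
	TxHash      string
	Logs        []Log
}

// LogRow is one row destined for the assumed logs table. topic0 is pulled
// out into its own column so the reader can filter on it with an index.
type LogRow struct {
	BlockNumber uint64
	TxHash      string
	LogIndex    int
	Address     string
	Topic0      string
	Data        string
}

// flatten turns one receipt into insertable rows; the indexer would batch
// these into INSERTs (e.g. via database/sql) as it walks new blocks.
func flatten(r Receipt) []LogRow {
	rows := make([]LogRow, 0, len(r.Logs))
	for i, l := range r.Logs {
		topic0 := ""
		if len(l.Topics) > 0 {
			topic0 = l.Topics[0]
		}
		rows = append(rows, LogRow{
			BlockNumber: r.BlockNumber,
			TxHash:      r.TxHash,
			LogIndex:    i,
			Address:     l.Address,
			Topic0:      topic0,
			Data:        l.Data,
		})
	}
	return rows
}

func main() {
	r := Receipt{BlockNumber: 100, TxHash: "0xabc",
		Logs: []Log{{Address: "0xaaa", Topics: []string{"0xt0"}}}}
	fmt.Println(len(flatten(r))) // 1
}
```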

Scalable readers can read from the DB and serve the RPC requests. This is the reader side of the DB.

Action Items

Look into existing indexed DBs and prototype an RPC handler to serve getLogs calls (@Jenya Piskunov)
Benchmark the performance of getLogs on an indexed DB vs. the existing code (@Jenya Piskunov)
Investigate Cloudflare and AWS for JSON-RPC parsing (@Jack Chan)
Architecture review

Reference

https://blog.infura.io/ethereum-rpcs-methods/

Infura has a cap of 10,000 events per getLogs query. "Best practice is to request a single block, as we've done in this example, and do that for each mined block."
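A client following that guidance can split a large block range into single-block getLogs requests. A minimal Go helper, purely illustrative:

```go
package main

import "fmt"

// singleBlockRanges splits [from, to] into single-block (from, to) pairs,
// one per mined block, so each getLogs request stays far under any
// per-query event cap.
func singleBlockRanges(from, to uint64) [][2]uint64 {
	if to < from {
		return nil
	}
	ranges := make([][2]uint64, 0, to-from+1)
	for b := from; b <= to; b++ {
		ranges = append(ranges, [2]uint64{b, b})
	}
	return ranges
}

func main() {
	fmt.Println(len(singleBlockRanges(10, 12))) // 3
}
```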

https://blog.infura.io/building-better-ethereum-infrastructure-48e76c94724b/?&utm_source=infurablog&utm_medium=referral&utm_campaign=tutorials&utm_content=eth_call_tutorial

RPC ethLogs PR

Currently 2 instances are behind the LB (rpcapi1.explorer.t.hmny.io and rpcapi2.explorer.t.hmny.io).

curl --location --request POST 'https://indexed-db-api.hmny.io/v0/rpc' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "jsonrpc": "2.0",
    "method": "eth_getLogs",
    "id": 1,
    "params": [{"fromBlock": "0x1340423", "toBlock": "0x1340423"}]
  }'

Benchmarks of one-by-one queries (no load) show the same response speed.

200 generated requests (same set used for node and Postgres):
  • NODE (internal.s0.t.hmny.io): total time 231577 ms
  • POSTGRES (explorer RPC): total time 218401 ms

Explore benchmarking using Infura’s Versus tool:

https://blog.infura.io/compare-ethereum-api-performance-with-versus/

Todo

  • Read-only replicas for the RPC nodes to read from, handled by RDS / Postgres
    • work out any needed configuration (@Jack Chan)
  • API gateway (@Nita Neou (Soph) )
    • search for getLogs in the payload (keep the Lambda parsing fast)
    • monitor the performance of the parser
    • dedicated getLogs endpoint behind an ELB: apilog.s0.t.hmny.io (internal only)
  • monitoring and scaling
    • auto scaling support
  • HA indexer for the explorer DB (@Jenya Piskunov)
    • address the instability of the indexer
    • dedicated indexer for getLogs
    • new DB to save the logs
  • a new separate endpoint for the new getLogs service by February 18, 2022 (@Nita Neou (Soph))
    • set up a new endpoint: https://apitest.s0.t.hmny.io
    • API gateway setup with a Lambda function to detect getLogs JSON-RPC calls
    • divert getLogs requests to the dedicated endpoint apilog.s0.t.hmny.io
    • divert regular RPC requests to a few standard API endpoints
    • monitor the performance of apilog
    • advertise it internally to a few validators and other dApps