☁️

Cloud Cost Reduction Open Access

The purpose of this document is to track daily progress and updates of the cloud cost reduction plan.

Jan 19, 2023 (Thu)

The majority of Harmony services were originally hosted on AWS. Despite AWS’s usefulness in metrics, monitoring, and service variety, the cost on the platform seemed unreasonable. The Devops team is migrating services away from AWS to other providers (Digital Ocean, Hetzner, Latitude, etc.) with more reasonable pricing. Plans and actions other than migration are also being laid out and carefully carried out. The major plans for Q1 are as follows:

Q1 Goals:

  • Shut down internal validator nodes (20 nodes)
  • Reduce cost on services

Shut Down Internal Validator Nodes

Harmony has 20 internal validators (4 shards and 5 validators per shard). As of Jan 19, 16 of the 20 nodes have been migrated away from AWS to Hetzner and Latitude. Currently, the cost breakdown for the validator nodes alone is as follows:

image

The total cost of the 4 validators in AWS exceeds that of the 16 nodes in Hetzner and Latitude combined. The Devops team is currently migrating the validators of S1, S2, and S3 to Digital Ocean. Once these 3 validators are completely migrated, the remaining validator for S0 will also be migrated. The expected overall cost after the complete migration is $1,779 / month ($21,348 / year). We will be cutting approximately $3,414 / month ($40,968 / year).
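
The monthly and yearly figures above can be cross-checked with a few lines of arithmetic. This is only an illustrative sketch: the variable and function names are made up for this example, and the pre-migration monthly cost is not quoted in the text, so it is inferred here as the expected cost plus the expected savings.

```python
# Figures quoted in the paragraph above (USD per month).
expected_monthly_cost = 1779      # after all 20 validators have left AWS
expected_monthly_savings = 3414   # approximate reduction

# Assumption: the pre-migration cost is not stated directly; it is
# inferred here as expected cost + expected savings.
inferred_previous_monthly_cost = expected_monthly_cost + expected_monthly_savings


def annualize(monthly_usd: int) -> int:
    """Convert a monthly figure into a yearly one (12 months)."""
    return monthly_usd * 12


print(f"Expected cost:    ${annualize(expected_monthly_cost):,} / year")
print(f"Expected savings: ${annualize(expected_monthly_savings):,} / year")
print(f"Inferred previous cost: ${inferred_previous_monthly_cost:,} / month")
```

The yearly values printed by this sketch match the ones quoted in the text, which suggests the per-year numbers are simply the monthly numbers multiplied by 12.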

Once the validators are completely migrated, Harmony will move forward with shutting down all 20 internal validators. This will require hardforks related to leader rotation, handing the leader role to external validators, internal node slots, and voting power reduction. Due to the complexity of this process, the shutdown will be performed carefully and deliberately. Shutting down the internal validators will not only cut cost even further but also fully decentralize the validator set.

Reduce Cost on Services

image

The Devops team was able to cut the cloud cost from approximately $120,000 to $95,000 during December 2022. We migrated and scaled down less utilized services without impacting the Harmony network. With further work, we expect to cut the cost to approximately $65,000 in Jan 2023.

Even though this is an improvement, there is more room to cut cost. Our current projected cost for Jun / Jul 2023 is less than $15,000. Once our subscriptions for AWS and other providers end, the goal of $15,000 will be achievable.
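
To put the milestones in this section side by side, the sketch below computes the reduction at each step relative to the roughly $120,000 starting point. The amounts are the approximate figures quoted above. The labels and the `reduction_pct` helper are made up for this illustration, and the Jun / Jul 2023 projection ("less than $15,000") is treated as an upper bound.

```python
# Approximate cloud cost milestones quoted in this section (USD).
milestones = [
    ("before the Dec 2022 cuts", 120_000),
    ("after the Dec 2022 cuts", 95_000),
    ("Jan 2023 (expected)", 65_000),
    ("Jun / Jul 2023 (projected upper bound)", 15_000),
]


def reduction_pct(baseline: float, current: float) -> float:
    """Percentage reduction of `current` relative to `baseline`."""
    return (baseline - current) / baseline * 100


baseline = milestones[0][1]
for label, cost in milestones:
    print(f"{label:<40} ${cost:>8,}  {reduction_pct(baseline, cost):5.1f}% below baseline")
```

Because the last figure is an upper bound, the final percentage should be read as a minimum reduction rather than an exact one.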

Conclusion

The details and status of the cost cutting process will be further documented.

Jan 24, 2023 (Tues)

As of Jan 23, 2023, all 4 internal validators from AWS - Main have been migrated to Hetzner (S0) and Digital Ocean (S1, S2, S3).

The cost of hosting the 4 validators in AWS was approximately $3,900/month. Now that the validators have been migrated away, the cost of the 3 validators in Digital Ocean is $168/month ($2,016/year) (the cost for S0 in Hetzner will be updated).

Our next step is shutting down all internal validators in order to fully decentralize the network as well as to cut cost further. You might wonder why the internal validators are not being shut down right away, or you might have other questions regarding Harmony’s cloud architecture. Below is a series of questions and answers that give insight into our current status.

  1. Why are the internal validators being migrated in the first place? Why are they not being shut down right away? Currently, the 20 internal validators are the only nodes serving as leaders in the Harmony network. As leaders, they propose blocks and continue the chain. If we were to terminate all the validators right away, there would be no leaders to propose blocks, and the network would not progress at all. In order to safely terminate all 20 validators without disrupting the network, external validators need to be included as potential leaders. This process will require hardforks regarding external leader rotation and slot reduction. We are currently aiming for late Q1 or early Q2 for these changes to take place. Once these hardforks are in place in the network, the internal validators will be fully terminated.
  2. Why are we transitioning back from ERPC nodes to legacy RPC nodes? To put it simply, ERPC nodes are costlier than RPC nodes. It is true that ERPC nodes can scale quickly. However, ERPC needs additional resources such as DB nodes, a Redis cluster, reader / writer components, etc. to fully function, so the entire architecture has a higher base cost than RPC nodes. In legacy RPC nodes, all of these components are embedded in the node itself. Since we are not expecting high growth during the crypto winter, we want to transition to legacy RPC to save cost.
  3. When will the ERPC nodes be fully terminated? Last year, when Harmony transitioned from RPC to ERPC to handle a large volume of traffic, the RPC nodes were fully dismantled (cutting cost from 500k to 150k monthly). Now that we are transitioning back, a series of tests and configuration changes is required to spin the RPC nodes back up. Once the full RPC nodes and the full archival RPC nodes are up, we can safely terminate the ERPC nodes.
  4. Why keep 4 archival RPC nodes? Why the redundancy? Each archival node holds 22TB of data. If we had a hardware failure and no redundancy (only a single node), resyncing the node from scratch would take a full year. You might ask, why not keep all that data in a storage service (e.g. S3 or Storj)? Due to the massive amount of data, that would be too expensive. Hence, we keep the data on nodes. Syncing by copying from another server takes around 2 weeks, which is reasonable compared to the full sync of 1 year.
  5. What are light nodes? They are lightweight RPC nodes that can handle simpler transactions. Since they do not need to download and store the whole blockchain, they are a lot cheaper and easier to spin up. Many of Harmony’s transactions are processed through MetaMask. Light nodes can fully handle MetaMask transactions, which makes a full RPC node for MetaMask overkill. An army of light nodes backed by full RPC nodes will eventually help scale Harmony’s RPC network. For full details, check out our Medium article!
  6. Why is the projected cost of $15k planned for Jun 2023 and not earlier? We bought Reserved Instances (RIs) last year in order to save cost on EC2 instances. We had larger traffic at the time, so purchasing the plan was a reasonable option. However, now that we are handling less traffic, we do not need the volume of instances we purchased up front. The payments will last through Jun 2023, so it only makes sense to utilize those instances. The monthly cost of the RIs is high, and only once we are done paying for them will we be able to cut cost to $15k.
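
Returning to the archival node question above, the quoted numbers (22TB per node, about 2 weeks to copy, about 1 year to resync) imply a rough average copy rate and a rough speedup from keeping redundant nodes. The sketch below is a back-of-the-envelope estimate only: it assumes decimal terabytes, exactly 14 days for the copy, and 365 days for the full sync, and all names are made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures from the archival node answer; exact values are assumptions.
archive_size_bytes = 22 * 10**12   # 22 TB, assuming decimal terabytes
copy_days = 14                     # "around 2 weeks"
full_sync_days = 365               # "a full year"

# Average sustained rate needed to copy the archive in the stated time.
copy_rate_mb_per_s = archive_size_bytes / (copy_days * SECONDS_PER_DAY) / 10**6

# How much faster recovery is when another node can be copied.
speedup = full_sync_days / copy_days

print(f"Implied average copy rate: {copy_rate_mb_per_s:.1f} MB/s")
print(f"Copying is roughly {speedup:.0f}x faster than a full resync")
```

Under these assumptions, recovering from a redundant node is more than an order of magnitude faster than resyncing from scratch, which is the main argument for keeping several archival nodes.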

Please reach out with any other questions or concerns, and I’ll happily answer them in this document! 🙂