One of the large applications I worked on had the same issue. To solve it, we ended up creating multiple smaller instances and hosting a set of related APIs on each server.
For example, read operations like listing posts, comments, etc. could live on one server, and write operations could be clustered on another.
Later, whichever server gets overloaded can be split up again. In our case, 20% of the APIs used around three quarters of the server resources, so we put those 20% on 4 large servers and kept the remaining 80% on 3 small servers.
This worked for us because the DBs were maintained on separate servers.
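To make the 20% / three-quarters split concrete, here is a rough sketch of how you might pick the "hot" APIs from per-endpoint load numbers. This is not what we actually ran. The function name `split_hot_cold`, the endpoint paths, and the load figures are all made up for illustration; in practice the numbers would come from whatever monitoring you already have.

```python
from typing import Dict, List, Tuple


def split_hot_cold(
    load_by_endpoint: Dict[str, float], hot_share: float = 0.75
) -> Tuple[List[str], List[str]]:
    """Split endpoints into (hot, cold).

    hot = the heaviest endpoints which together reach `hot_share`
    of the total load; cold = everything else.
    """
    total = sum(load_by_endpoint.values())
    # Heaviest endpoints first.
    ranked = sorted(load_by_endpoint, key=load_by_endpoint.get, reverse=True)

    hot: List[str] = []
    running = 0.0
    for endpoint in ranked:
        if running >= hot_share * total:
            break
        hot.append(endpoint)
        running += load_by_endpoint[endpoint]

    cold = [e for e in ranked if e not in hot]
    return hot, cold


# Hypothetical load figures (percent of server resources per endpoint).
load = {"/feed": 40.0, "/search": 35.0}
load.update({f"/misc/{i}": 3.125 for i in range(8)})

hot, cold = split_hot_cold(load)
print("large servers:", hot)
print("small servers:", cold)
```

With these made-up numbers, 2 of the 10 endpoints land in the hot group, which would go to the large servers, and the other 8 go to the small ones. Re-running this on fresh numbers every so often tells you which group needs splitting next.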
I wonder if a quasi micro-services approach will solve the issue here.
Edit 1: If done properly, this approach can be cost-effective. In some cases it might cost 10 to 20 percent more in server costs, but it will lead to a visible improvement in performance.
Yeah, Star Trek is an example of HFY. Star Trek fan fics are welcome too.