The largest table holds data that is only needed by Lemmy briefly. There is a scheduled job to clear it... every 6 months. There are active discussions on how best to handle this.
On my instance I've set a cronjob to delete everything but the most recent 100k rows of that table every hour.
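A rough sketch of that kind of "keep only the newest N rows" pruning, for illustration only. This is not the actual job running on the instance: the table name `big_table` and the `id` column are made-up placeholders, and it uses Python's built-in `sqlite3` purely so the example runs anywhere. Lemmy itself runs on PostgreSQL, so the SQL would likely need adapting and testing before being pointed at a real database.

```python
import sqlite3

KEEP_ROWS = 100_000  # retention figure mentioned in the post


def prune_oldest(conn, table, keep):
    """Delete every row except the `keep` rows with the highest id.

    `table` is interpolated into the SQL, so it must be a trusted,
    hard-coded name (here a hypothetical placeholder), never user input.
    Assumes the table has a monotonically increasing `id` column.
    """
    cur = conn.execute(
        f"DELETE FROM {table} WHERE id NOT IN "
        f"(SELECT id FROM {table} ORDER BY id DESC LIMIT ?)",
        (keep,),
    )
    conn.commit()
    return cur.rowcount  # number of rows removed


def demo():
    """Small in-memory run: insert 10 rows, keep the newest 3."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany("INSERT INTO big_table (payload) VALUES (?)", [("x",)] * 10)
    deleted = prune_oldest(conn, "big_table", keep=3)
    remaining = [row[0] for row in conn.execute("SELECT id FROM big_table ORDER BY id")]
    conn.close()
    return deleted, remaining


if __name__ == "__main__":
    print(demo())
```

On a real instance a script along these lines would be called with `KEEP_ROWS` and scheduled from cron to run once an hour.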
On the point of being unsustainable, I disagree. Instances will need to find an equilibrium between cost and retention of old content. The higher the revenue/cost tolerance, the older the content that can be retained. I expect most instances will end up purging non-local content after an amount of time, but retain local content as long as possible. Maybe I'm naive, but I have confidence that people smarter than me will come up with systems to do this. It may result in a Usenet-style setup where instances boast about their retention periods.
On your second point re: community contributions, I agree entirely. I've been very fortunate that there have been some generous donations from aussie.zone users, so I'm not worried about server costs at this point. Server costs will go up as data volumes increase; that is unavoidable. How the community decides to handle this in the future is the real question. Based on what I've experienced so far, I'm confident we'll be around for a long time to come.
Hmm, sorry, I can't help. Though this sort of question may see a better audience over at Whirlpool. Much higher density of nerds over there.