Hi. I’m kind of a noob in the world of self-hosting, and Matrix for that matter. But I was wondering: how heavy is it to host a Matrix server?

My understanding of how Matrix works is that each participating server in a room stores the full history, and then later some sort of merging happens, or something like that.

How is that sustainable? Say in 5 years Matrix becomes mainstream, 5 people join my server, and each of them also joins 3 different 10k+ people rooms with long histories. Do I now have to account for that, or do people have to be careful about joining larger rooms when they sign up on a smaller-ish server?

Or do I not understand how Matrix works? Thanks.

  • northernlights@lemmy.today · 3 days ago

    And, importantly, run the DB on PostgreSQL, not SQLite, and implement the regular DB maintenance steps explained in the wiki. I’ve been running mine like that in a small VM for about 6 months; I join large communities and run WhatsApp, Google Messages and Discord bridges, and my DB is 400 MB.

    Before, when I was still testing and hadn’t implemented the regular DB maintenance, it ballooned up to 10 GB in 4 months.

    [screenshot of CloudBeaver]
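
    On the PostgreSQL point: Synapse picks its database from the `database` section of homeserver.yaml. A minimal sketch of what that looks like (credentials and pool sizes here are placeholders, check the Synapse docs for your version):

    ```yaml
    # homeserver.yaml (sketch, not a drop-in config)
    database:
      name: psycopg2          # PostgreSQL via psycopg2; "sqlite3" is the default
      args:
        user: synapse_user    # placeholder credentials
        password: yourpassword
        database: synapse
        host: localhost
        cp_min: 5             # connection pool bounds
        cp_max: 10
    ```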

    • drkt@scribe.disroot.org · 3 days ago

      It is my understanding that all of the DB ballooning is room state, which you can’t really prune. What exactly are you pruning from the DB?

      • northernlights@lemmy.today · 22 hours ago

        I purge media older than 2 weeks using these. Then I purge the largest rooms’ history events using these. Then I compress the DB using this.

        It looks like this:

        export PGPASSWORD=$DB_PASS
        export MYTOKEN="yourtokengoeshere"
        export TIMESTAMP=$(date --date='2 weeks ago' '+%s%N' | cut -b1-13)
        
        echo "DB size:"
        psql --host $DB_HOST -U $DB_USER -d $DB_NAME -c "SELECT pg_size_pretty(pg_database_size('$DB_NAME'));"
        
        echo "Purging remote media"
        curl \
        	-X POST \
        	--header "Authorization: Bearer $MYTOKEN" \
        	"http://localhost:8008/_synapse/admin/v1/purge_media_cache?before_ts=${TIMESTAMP}"
        
        echo ''
        echo 'Purging local media'
        curl \
        	-X POST \
        	--header "Authorization: Bearer $MYTOKEN" \
        	"http://localhost:8008/_synapse/admin/v1/media/delete?before_ts=${TIMESTAMP}"
        
        echo ''
        echo 'Purging room Arch Linux'
        export ROOM='!usBJpHiVDuopesfvJo:archlinux.org'
        curl \
        	-X POST \
        	--header "Authorization: Bearer $MYTOKEN" \
        	--data-raw '{"purge_up_to_ts":'${TIMESTAMP}'}' \
        	"http://localhost:8008/_synapse/admin/v1/purge_history/${ROOM}"
        
        echo ''
        echo 'Purging room Arch Offtopic'
        export ROOM='!zGNeatjQRNTWLiTpMb:archlinux.org'
        curl \
        	-X POST \
        	--header "Authorization: Bearer $MYTOKEN" \
        	--data-raw '{"purge_up_to_ts":'${TIMESTAMP}'}' \
        	"http://localhost:8008/_synapse/admin/v1/purge_history/${ROOM}"
        
        echo ''
        echo 'Compressing db'
        /home/northernlights/scripts/synapse_auto_compressor -p postgresql://$DB_USER:$DB_PASS@$DB_HOST/$DB_NAME -c 500 -n 100
        
        echo "DB size:"
        psql --host $DB_HOST -U $DB_USER -d $DB_NAME -c "SELECT pg_size_pretty(pg_database_size('$DB_NAME'));"
        
        unset PGPASSWORD
        

        And periodically I run vacuum;
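
        A side note on that TIMESTAMP line, since it’s easy to trip over: the Synapse admin APIs want a Unix timestamp in milliseconds. A hedged sketch of what it’s doing (GNU date, so Linux; BSD/macOS date has no --date flag):

        ```shell
        # GNU date prints seconds plus nanoseconds since the epoch: 19 digits
        RAW=$(date --date='2 weeks ago' '+%s%N')
        # keeping the first 13 bytes truncates that to milliseconds
        TIMESTAMP=$(printf '%s' "$RAW" | cut -b1-13)
        # sanity check: 13 digits = epoch milliseconds
        printf '%s\n' "$TIMESTAMP" | grep -Eq '^[0-9]{13}$' && echo "epoch-ms: $TIMESTAMP"
        ```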

        • Yaky@slrpnk.net · 2 days ago

          Thank you for the queries. The rhetorical question is: why isn’t the server handling this itself?

          • northernlights@lemmy.today · 2 days ago

            I don’t know, I can’t speak for the devs. It is weird that if you don’t implement these API calls, buried a bit deep in the wiki, you end up storing every meme and screenshot anybody posted on any instance for the rest of time. But I found them through issue reports, with many people asking for these to be implemented by default as, for instance, a simple “purge after X days” setting plus a list of rooms to include in or exclude from the history clean-up.
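
            For what it’s worth, I believe more recent Synapse releases did grow a built-in knob for the media half of this (media_retention, added around 1.62 if I remember right), though not for room history purging. Something like this in homeserver.yaml, durations here are just examples, check the docs for your version:

            ```yaml
            # homeserver.yaml (sketch; needs a reasonably recent Synapse)
            media_retention:
              local_media_lifetime: 90d    # example values, not recommendations
              remote_media_lifetime: 14d
            ```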

            • WhyJiffie@sh.itjust.works · 2 days ago

              Are the media files redownloaded from other servers when someone tries to load them? I guess all local media is lost forever, but maybe not remote media.

              • northernlights@lemmy.today · 21 hours ago

                In my understanding that’s the idea: the local ones are lost unless another federated instance synced them. As for the remote ones, maybe they’re backed up, but I really don’t mind an instant-messaging platform having no rear view past 2 weeks.

                • WhyJiffie@sh.itjust.works · 15 hours ago

                  unless another federated instance synced them

                  I don’t think the homeserver tries to re-fetch media that was local but has since been deleted.

                  We often talk about how Discord is a black hole of information, but this is worse than that.