January 9, 2023 by Drew DeVault

SourceHut will (not) blacklist the Go module mirror

Update 2023-01-31: Russ Cox of the Go team reached out to us to address this problem. After some discussion, an acceptable plan was worked out. The Go team is working on deploying an update to the “go” tool to add a -reuse flag, which should substantially reduce the traffic generated by this system for all users of Go.

In the meantime, the automated refresh traffic from proxy.golang.org was disabled for SourceHut, which the Go team assures us should have little-to-no impact on users and which reduces the burden on our system to a manageable level. Following this change by the Go team, we have observed traffic from the Go module mirror reduced to an acceptable level. The Go team has decided that the automatic refresh behavior is their responsibility, not the responsibility of other operators, so any other small hosts will hopefully not be affected: the Go team will enable or disable the refresh behavior at their discretion with the burden on third-party operators in mind.

Consequently, we have cancelled our plans to disable Go traffic to git.sr.ht. No action is required by users to continue receiving service. Thanks Russ!

The original post can be read below.


SourceHut will disable git access for the Go Module Mirror on February 24th, 2023. This will cause a service impact for Go users. This article explains why this step is necessary and how Go users can work around the issue.

tl;dr

From February 24th, users running go get or a similar command on Go packages which import modules from SourceHut repositories will be met with an error message similar to the following:

$ go get
go: downloading git.sr.ht/~sircmpwn/foobaz v0.0.0-20230108094957-81402546c10e
go: git.sr.ht/~sircmpwn/foobaz@v0.0.0-20230108094957-81402546c10e: verifying module: git.sr.ht/~sircmpwn/foobaz@v0.0.0-20230108094957-81402546c10e: reading https://sum.golang.org/lookup/git.sr.ht/~sircmpwn/foobaz@v0.0.0-20230108094957-81402546c10e: 404 Not Found
	server response:
	not found: git.sr.ht/~sircmpwn/foobaz@v0.0.0-20230108094957-81402546c10e: invalid version: git ls-remote -q origin in /tmp/gopath/pkg/mod/cache/vcs/568e5edafe93f7887c0b6f718b0f17ea91c63c35822fb28628535f172b5429b7: exit status 128:
		fatal: unable to access 'https://git.sr.ht/~sircmpwn/foobaz/': The requested URL returned error: 429

The following workaround will correct the issue:

$ export GOPRIVATE=git.sr.ht
$ go get # works

For more detail, read on.

Background

The Go programming language fetches modules via git, such that a user who imports “git.sr.ht/~sircmpwn/dowork” will cause the toolchain to fetch the corresponding repository via git in order to make it available in the user’s Go environment. Each of these requests is routed through a proxy service at proxy.golang.org, which provides a number of features, including a cache of the modules it has fetched.

This comes with a number of downsides. For example, most Go users are unaware that every package they fetch is accompanied by a request to Google’s servers, and implies a trust relationship with Google to return authentic packages. Additionally, should the underlying source repository disappear or become out of sync with the cache, the problem is hidden from Go users, which can cause their software to become dependent on modules or module versions which no longer exist.

More importantly for SourceHut, the proxy will regularly fetch Go packages from their source repositories to check for updates – independent of any user requests, such as running go get. These requests take the form of a complete git clone of the source repository, which is the most expensive kind of request for git.sr.ht to service. Additionally, these requests originate from many servers which do not coordinate with each other to reduce their workload. The requests can arrive at a rate as high as ~2,500 per hour, often batched with up to a dozen clones at once, and they are generally highly redundant: a single git repository can be fetched over 100 times per hour.

This traffic produces an excessive background workload which is constantly being serviced by git.sr.ht. Due to the relatively large traffic requirements of git clones, this represents about 70% of all outgoing network traffic from git.sr.ht. A single module can produce as much as 4 GiB of daily traffic from Google.

Seeking other solutions

Blacklisting GoModuleMirror is our last resort, and we wanted to avoid it if possible. We attempted to work with the Go team on a solution, but were unsuccessful.

On February 24th, 2021, we reported an issue to the Go team regarding this problem. The Go team initially helped us narrow down the cause, first by setting an appropriate User-Agent to help identify this traffic, then through discussions regarding the behavior of this system. We made recommendations to Google for how to service their requirements without generating an excessive amount of redundant traffic. However, the discussion stalled and no further changes were made by Google to address the issue, and we continued to receive an excessive amount of traffic from the module mirror.

The situation remained so for over a year. In that time, I was banned from the Go issue tracker without explanation, and was unable to continue discussing the problem with Google. With few options left, I wrote a blog post on May 25th, 2022 outlining the issue and petitioning the public for support in addressing this problem. However, the problem remains unsolved.

February 24th, 2023, the date that we plan to disable Go traffic, marks two years since the initial complaint was submitted to the Go team. The cost of bearing this traffic is no longer acceptable to us, and the Go team has made no attempts to correct the issue during this time. We want to avoid causing inconvenience for Go users, but the load and cost are too high for us to continue justifying support for this feature.

Recommendations for the Go team

From February 24th, git clone requests with a GoModuleMirror User-Agent will receive a 429 (Too Many Requests) response. To restore service, we have the following recommendations for the Go team:

  1. Obey robots.txt, including the Crawl-Delay directive, to control the rate at which modules are fetched.
  2. Perform a shallow git clone rather than a full git clone; or, ideally, store the last seen commit hash for each reference and only fetch if it has been updated.
  3. Reduce redundant traffic: fetch each git repository less often. It should not be necessary to fetch the same git module up to 2,000 times per day.
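
The second recommendation can be illustrated with a minimal sketch. It assumes the mirror keeps a map of the last commit hash it saw for each reference and compares it against the "<hash><TAB><ref>" lines that git ls-remote prints. The function names and the hashes below are hypothetical and chosen only for illustration; this is not how the Go module mirror is actually implemented.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// parseLsRemote parses the "<hash>\t<ref>" lines printed by `git ls-remote`
// into a map from reference name to commit hash. Malformed lines are skipped.
func parseLsRemote(out string) map[string]string {
	refs := make(map[string]string)
	for _, line := range strings.Split(out, "\n") {
		fields := strings.Fields(line)
		if len(fields) != 2 {
			continue
		}
		refs[fields[1]] = fields[0]
	}
	return refs
}

// changedRefs returns, in sorted order, the references in remote whose hash
// is new or differs from the hash recorded in lastSeen.
func changedRefs(lastSeen, remote map[string]string) []string {
	var changed []string
	for ref, hash := range remote {
		if lastSeen[ref] != hash {
			changed = append(changed, ref)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	// Hypothetical state recorded after the previous fetch.
	lastSeen := map[string]string{
		"refs/heads/master": "1111111111111111111111111111111111111111",
		"refs/tags/v1.0.0":  "2222222222222222222222222222222222222222",
	}
	// Hypothetical ls-remote output: nothing has moved since the last fetch.
	out := "1111111111111111111111111111111111111111\trefs/heads/master\n" +
		"2222222222222222222222222222222222222222\trefs/tags/v1.0.0\n"

	if changed := changedRefs(lastSeen, parseLsRemote(out)); len(changed) == 0 {
		fmt.Println("no refs changed; skip the fetch")
	} else {
		fmt.Println("fetch needed for:", changed)
	}
}
```

Listing references is far cheaper for the host than serving a clone, so a check like this would let most refreshes end without transferring any objects. The sketch ignores deleted references, which a real implementation would also need to handle.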

If these issues are addressed, we would be pleased to re-enable GoModuleMirror’s access to our services. The Go team can reach me via email to discuss the matter further, if they have ideas for other solutions or require any additional details to address the problem.

Recommendations for Go users

Unfortunately, this will affect Go users, who will need to apply a workaround in order to compile Go software that depends on modules hosted on git.sr.ht. If you encounter an error similar to the following:

$ go get
go: downloading git.sr.ht/~sircmpwn/foobaz v0.0.0-20230108094957-81402546c10e
go: git.sr.ht/~sircmpwn/foobaz@v0.0.0-20230108094957-81402546c10e: verifying module: git.sr.ht/~sircmpwn/foobaz@v0.0.0-20230108094957-81402546c10e: reading https://sum.golang.org/lookup/git.sr.ht/~sircmpwn/foobaz@v0.0.0-20230108094957-81402546c10e: 404 Not Found
	server response:
	not found: git.sr.ht/~sircmpwn/foobaz@v0.0.0-20230108094957-81402546c10e: invalid version: git ls-remote -q origin in /tmp/gopath/pkg/mod/cache/vcs/568e5edafe93f7887c0b6f718b0f17ea91c63c35822fb28628535f172b5429b7: exit status 128:
		fatal: unable to access 'https://git.sr.ht/~sircmpwn/foobaz/': The requested URL returned error: 429

You can correct it by bypassing proxy.golang.org:

$ export GOPRIVATE=git.sr.ht
$ go get # works
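
GOPRIVATE is documented as a comma-separated list of glob patterns that are matched against prefixes of the module path; modules that match are fetched directly and are not looked up in the proxy or the checksum database. The following is a rough sketch of that matching rule using only the standard library, to show which modules the setting above covers. The function name is hypothetical and the logic is an approximation, not the go command's own implementation.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchesPrefixPatterns reports whether some leading run of path elements in
// target matches one of the comma-separated glob patterns in globs.
func matchesPrefixPatterns(globs, target string) bool {
	elems := strings.Split(target, "/")
	for _, glob := range strings.Split(globs, ",") {
		glob = strings.TrimSpace(glob)
		if glob == "" {
			continue
		}
		// Compare the pattern against a prefix of target that has the same
		// number of path elements as the pattern itself.
		n := strings.Count(glob, "/") + 1
		if len(elems) < n {
			continue
		}
		prefix := strings.Join(elems[:n], "/")
		if ok, err := path.Match(glob, prefix); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	private := "git.sr.ht" // the value used in the workaround
	for _, mod := range []string{
		"git.sr.ht/~sircmpwn/foobaz",
		"github.com/example/module", // hypothetical module on another host
	} {
		fmt.Printf("%s bypasses proxy: %v\n", mod, matchesPrefixPatterns(private, mod))
	}
}
```

A narrower pattern such as git.sr.ht/~sircmpwn would, under the same rule, bypass the proxy only for that user's repositories.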

We will set this variable by default for builds.sr.ht jobs prior to disabling service for the Go module mirror.

We understand if Go users need to migrate away from SourceHut to deal with these problems. We apologise for this inconvenience, and we hope to see you return should the problem be resolved. The SourceHut team is available to assist with any necessary migration efforts via sr.ht-discuss.