The 502
Pasting an image into the terminal failed with a 502 in our US deployments. The upload path was a multipart proxy: the browser POSTed the file to the backend, the backend streamed it to S3, then returned a presigned URL for the stored object. The backend has a 15-second HTTP timeout. For a US client uploading through a backend that then talks to S3, the backend-to-S3 transfer alone could exceed that budget, and the request was cut off before the object finished landing.
The backend was doing nothing with the bytes except forwarding them. It buffered the multipart body, opened the file, and called storage.Upload(...) — pure pass-through, but pass-through that counted against the request timeout and consumed backend memory and bandwidth per upload.
Why proxying was the wrong place for the data
Two costs stacked up in the proxy design:
- Latency budget. The client-visible request stayed open for the entire client→backend→S3 transfer. Cross-region, that competes with a fixed 15s server timeout that exists for good reasons elsewhere. Raising it globally to accommodate uploads is the wrong lever.
- The backend on the data path. Every byte transited the backend process. It added no value — no transformation, no validation that needed the body, no record anyone read.
That last point turned out to matter. The old flow also wrote a file DB row (uploader, original name, storage key, mime, size) and exposed GetByID, GetURL(id), and DELETE /files/:id. Grepping for consumers found none: nothing read the row back, nothing called delete. The metadata was being persisted on the assumption it would be useful, not because anything depended on it.
Presigned PUT + GET
The replacement (PR #131) keeps the backend in the control path and takes it out of the data path. The browser asks the backend for two presigned URLs, then talks to S3 directly:
POST /api/v1/orgs/:slug/files/presign
{ "filename": "...", "content_type": "image/png", "size": 12345 }
→ { "put_url": "...", "get_url": "..." }
The backend validates, signs, and returns. The service body is small:
func (s *Service) RequestPresignedUpload(ctx context.Context, req *PresignUploadRequest) (*PresignUploadResponse, error) {
	maxSize := s.config.MaxFileSize * 1024 * 1024
	if req.Size > maxSize {
		return nil, fmt.Errorf("%w: max size is %d MB", ErrFileTooLarge, s.config.MaxFileSize)
	}
	if !s.isAllowedType(req.ContentType) {
		return nil, fmt.Errorf("%w: %s", ErrInvalidFileType, req.ContentType)
	}
	storageKey := s.generateStorageKey(req.OrganizationID, req.FileName)
	putURL, err := s.storage.PresignPutURL(ctx, storageKey, req.ContentType, 15*time.Minute)
	// ...
	getURL, err := s.storage.GetURL(ctx, storageKey, 24*time.Hour)
	// ...
}
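The two checks that run before signing can be sketched in isolation. This is a minimal illustration, not the service's actual code: the allowlist contents, the helper name withinSizeLimit, and the rejection of non-positive sizes are all assumptions added for the example.

```go
package main

import (
	"mime"
)

// allowedTypes is a hypothetical allowlist; the real set presumably
// comes from service configuration.
var allowedTypes = map[string]bool{
	"image/png":  true,
	"image/jpeg": true,
	"image/gif":  true,
	"image/webp": true,
}

// isAllowedType reports whether contentType names an allowlisted media
// type. Parameters such as "; charset=utf-8" are ignored, and
// mime.ParseMediaType lowercases the type before the lookup.
func isAllowedType(contentType string) bool {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false
	}
	return allowedTypes[mediaType]
}

// withinSizeLimit checks a declared size against a limit given in MB.
// Rejecting zero and negative sizes is an assumption of this sketch.
func withinSizeLimit(size, maxMB int64) bool {
	return size > 0 && size <= maxMB*1024*1024
}
```

Both functions only look at what the client declared, which is all the backend can see once the body no longer passes through it.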
The client is a three-step sequence — presign, PUT to S3, return the GET URL:
const { put_url, get_url } = await (await fetch(`${API_BASE_URL}/.../files/presign`, {
  method: "POST",
  headers: { "Content-Type": "application/json", Authorization: `Bearer ${token}` },
  body: JSON.stringify({ filename: file.name, content_type: file.type, size: file.size }),
})).json();
await fetch(put_url, { method: "PUT", headers: { "Content-Type": file.type }, body: file });
return get_url; // valid 24h
The backend no longer touches the body, so its timeout no longer bounds the transfer. The transfer is now between the browser and S3, where it belongs.
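For a non-browser client, the PUT step might look roughly like the Go sketch below. The function names are hypothetical and this is not the project's code. It illustrates two details that are easy to miss: the Content-Type header has to repeat the value declared on the presign request, and a non-2xx response from S3 should be treated as a failed upload rather than ignored.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
)

// newPutRequest builds the direct-to-S3 PUT. contentType must be the same
// string that was sent to the presign endpoint.
func newPutRequest(putURL, contentType string, size int64, body io.Reader) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPut, putURL, body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", contentType)
	// Set the length explicitly so a generic io.Reader (for example a file)
	// is not sent with chunked transfer encoding.
	req.ContentLength = size
	return req, nil
}

// putObject sends the PUT and returns an error for any non-2xx status,
// including a short excerpt of the response body for diagnosis.
func putObject(ctx context.Context, client *http.Client, putURL, contentType string, size int64, body io.Reader) error {
	req, err := newPutRequest(putURL, contentType, size, body)
	if err != nil {
		return err
	}
	resp, err := client.Do(req.WithContext(ctx))
	if err != nil {
		return fmt.Errorf("put object: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		msg, _ := io.ReadAll(io.LimitReader(resp.Body, 1024))
		return fmt.Errorf("put object: status %d: %s", resp.StatusCode, msg)
	}
	return nil
}
```

The context passed to putObject is the caller's own, so the upload deadline is chosen by the uploader and not by the backend's request timeout.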
Things that don't move to the client
Direct-to-S3 does not mean trust-the-client. Three constraints stay server-side because the presign request is the only point the backend still controls:
- Size and MIME validation run before signing. The size limit is advisory (S3 enforces nothing about the body matching the declared size), but the MIME check is meaningful because Content-Type is baked into the signature.
- The storage key is generated, never accepted. Format: orgs/{org_id}/files/{year}/{month}/{uuid}{ext}. The client never names the object, so it can't overwrite another org's key or escape the tenant prefix.
- Expiry is asymmetric and deliberate. PUT is signed for 15 minutes: long enough for one upload, short enough that a leaked URL ages out fast. GET is 24 hours, sized to how long a pasted image link needs to resolve.
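A key generator in that format could look like the sketch below. It is not the real generateStorageKey: the function names, the stdlib-only UUID, and the extension filter are assumptions for illustration. The only thing taken from the client's filename is a sanitized extension, and orgID is assumed to come from the authenticated route, not from the request body.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"path"
	"strings"
	"time"
)

// safeExt returns a lowercase extension such as ".png", or "" when the
// name has no extension or the extension contains anything other than
// a-z and 0-9. Nothing else from the filename reaches the key.
func safeExt(filename string) string {
	ext := strings.ToLower(path.Ext(filename))
	if len(ext) < 2 || len(ext) > 11 {
		return ""
	}
	for _, r := range ext[1:] {
		if !(r >= 'a' && r <= 'z') && !(r >= '0' && r <= '9') {
			return ""
		}
	}
	return ext
}

// buildStorageKey formats orgs/{org_id}/files/{year}/{month}/{id}{ext}.
func buildStorageKey(orgID string, year, month int, id, filename string) string {
	return fmt.Sprintf("orgs/%s/files/%04d/%02d/%s%s", orgID, year, month, id, safeExt(filename))
}

// newUUID returns a random version 4 UUID using only the standard library.
func newUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// NewStorageKey generates a fresh key for orgID using the current UTC date.
func NewStorageKey(orgID, filename string) (string, error) {
	id, err := newUUID()
	if err != nil {
		return "", err
	}
	now := time.Now().UTC()
	return buildStorageKey(orgID, now.Year(), int(now.Month()), id, filename), nil
}
```

With this shape, a filename like ../../evil.PNG contributes only .png to the key, so path segments in the client's name cannot move the object outside the org prefix.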
Content-Type is the subtle one. PresignPutObject binds it into the SigV4 signature, so the Content-Type the client sends on the PUT must equal the one it declared on /presign, or S3 rejects the signature. The contract between the two requests is enforced by S3, not by us.
Two endpoints, one signature gotcha
S3 presigning signs the host. We presign against the public endpoint when the browser is the client and the internal endpoint for service-to-service uploads (the Runner uploads diagnostic logs the same way, over the Docker network), because a URL signed for one host 403s when sent to the other:
// When a public endpoint differs from the internal one, use a presign
// client bound to the public host so the SigV4 signature matches the
// host the client will actually contact.
presigner := s.presign
if s.publicPresign != nil {
	presigner = s.publicPresign
}
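Why a change of host or Content-Type invalidates the URL can be shown with a toy model. The sketch below is deliberately not SigV4: it is a plain HMAC over a simplified canonical string, and every name in it is hypothetical. It only demonstrates the property the design relies on, which is that any field covered by the signature must arrive unchanged.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"strings"
)

// signedRequest lists the request properties covered by the toy signature.
type signedRequest struct {
	Method      string
	Host        string
	Path        string
	ContentType string
}

// canonical joins the covered fields in a fixed order. Real SigV4 uses a
// richer canonical request; this is a simplified stand-in.
func canonical(r signedRequest) string {
	return strings.Join([]string{
		r.Method,
		"host:" + strings.ToLower(r.Host),
		"content-type:" + r.ContentType,
		r.Path,
	}, "\n")
}

// sign returns a hex HMAC-SHA256 of the canonical string.
func sign(secret string, r signedRequest) string {
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write([]byte(canonical(r)))
	return hex.EncodeToString(mac.Sum(nil))
}

// verify recomputes the signature from the request as received and
// compares it in constant time with the one presented.
func verify(secret string, received signedRequest, sig string) bool {
	return hmac.Equal([]byte(sign(secret, received)), []byte(sig))
}
```

In this model, a signature issued for host s3.example.com with Content-Type image/png fails verification if the request arrives at minio:9000 or carries text/html, which mirrors the 403 described above.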
Browser uploads also need CORS on the bucket. With the backend out of the path, the PUT is a cross-origin request straight to S3/MinIO, so MINIO_API_CORS_ALLOW_ORIGIN had to be added to every docker-compose file. A later pass found the matching read-side bug: GetURL was returning an unsigned public URL that 403'd on default-private MinIO buckets, so it now routes through the public presign client too.
Result
The change deleted more than it added: 211 insertions, 1102 deletions across 24 files. Gone are the multipart handler, the file DB record and its repository, the domain/file package, and the DELETE /files/:id endpoint. What remains is one validate-and-sign endpoint and a storage interface with PresignPutURL alongside the existing GetURL.
The transferable rule: if a service forwards bytes it neither inspects nor stores, the bytes shouldn't go through it. Presigned URLs let the backend stay the policy decision point — auth, key namespacing, type and expiry — while the data takes the direct path. Keep validation, key generation, and expiry on the server; let Content-Type and the signed host be the parts S3 enforces for you.