What was the root cause of https://gitlab.com/gitlab-com/gl-infra/production/-/issues/18929? What follow up actions were taken? What changes were made?
I'll research the GitLab issue you mentioned to find information about the root cause, follow-up actions, and changes made.
Executive Summary
Research completed successfully! I've analyzed 6 GitLab items and generated a comprehensive report.
Analysis of Production Incident: Unintended Project Import Emails and Credential Exposure
On December 3, 2024, GitLab experienced a production incident, gitlab-com/gl-infra/production#18929, where a new feature inadvertently sent a large volume of email notifications for old or mirrored projects. This not only caused user confusion but also led to a critical security vulnerability where user credentials were exposed in plain text. This report details the root cause of the incident, the immediate and long-term follow-up actions taken, and the specific changes implemented to resolve the issue and prevent its recurrence.
1. Root Cause: Flawed Notification Logic and Unforeseen Dependencies
The direct cause of the incident was the deployment of Merge Request gitlab-org/gitlab!173596, which introduced an email notification for when a project import was successfully completed. The implementation, however, had two critical flaws:
- Insufficient Filtering: The logic to trigger the email relied on the project.import_state transitioning to a finished state. It failed to account for the fact that other GitLab features, particularly repository mirroring and project forking, leverage the same underlying import mechanism. A mirrored repository's state is continuously updated and transitions to finished after each successful sync. This caused the system to send a "completion" email for every mirror update, as well as for old projects whose import state was re-evaluated.
- Credential Exposure in URLs: A severe consequence of the first flaw was a critical security vulnerability, detailed in gitlab-org/gitlab#507531. When a user configured a repository mirror using a URL with embedded basic authentication (e.g., https://user:password@example.com/repo.git), the notification email included this full, unsanitized URL in both the subject and the body, exposing the user's password in plain text.
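To make the second flaw concrete, here is a minimal Python sketch of the kind of URL sanitization the email template lacked. It is an illustration only: the helper name strip_credentials is hypothetical, it is not GitLab's code (GitLab is written in Ruby), and according to this report the actual fix scoped the notification rather than adding a helper like this.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(url: str) -> str:
    """Return the URL with any 'user:password@' userinfo removed.

    Hypothetical helper for illustration; not part of GitLab.
    """
    parts = urlsplit(url)
    # netloc looks like 'user:password@host:port'; keep only what follows the last '@'.
    host_and_port = parts.netloc.rpartition("@")[2]
    return urlunsplit(parts._replace(netloc=host_and_port))


# The example URL form mentioned in the report.
print(strip_credentials("https://user:password@example.com/repo.git"))
# A percent-encoded username (e.g. an email address) is removed as well.
print(strip_credentials("https://me%40example.org:secret@example.com/repo.git"))
```

Both calls print https://example.com/repo.git. A URL without embedded credentials passes through unchanged, so a helper of this kind can be applied unconditionally before a URL is rendered into a subject line or body.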
A user who discovered the vulnerability provided a clear example:
clarktrip1 commented on 2024-12-03 14:49:10 UTC:
3 weeks after setting up a repo to do a pull mirror of https://github.com/nia-medtech/expo-server-sdk-java.git, I got 9 confirmation emails that it worked and they all had my credentials used to access the GH repo in both the email subject and body.
Subject:
Expo Server SDK Java | Import from https://caclark%40trueblue.com:MYREALPASSWORD@github.com/nia-medtech/expo-server-sdk-java.git completed
Body:
Import completed
The import you started on November 14, 2024 from https://caclark%40trueblue.com:MYREALPASSWORD@github.com/nia-medtech/expo-server-sdk-java.git...
The underlying architectural issue was a lack of awareness of the interdependencies between the project import feature (owned by group::import and integrate) and the repository mirroring feature (owned by group::source code).
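The filtering flaw described in this section can be modeled with a short Python sketch. It assumes a toy Project class and a single shared routine, run_import_machinery, that both one-time imports and mirror syncs pass through; every name here is hypothetical and chosen only for illustration, not taken from the GitLab codebase.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    is_mirror: bool = False
    sent_emails: list = field(default_factory=list)


def on_import_finished(project: Project) -> None:
    """Naive hook: notifies on every transition to 'finished', with no filtering."""
    project.sent_emails.append(f"{project.name} | Import completed")


def run_import_machinery(project: Project) -> None:
    """Stand-in for the shared mechanism used by imports and by mirror syncs."""
    # ... repository data would be fetched here ...
    on_import_finished(project)  # the state reaches 'finished' on every run


imported = Project("one-time-import")
run_import_machinery(imported)  # a genuine one-time import

mirror = Project("pull-mirror", is_mirror=True)
for _ in range(3):  # three scheduled mirror syncs
    run_import_machinery(mirror)

print(len(imported.sent_emails))  # 1
print(len(mirror.sent_emails))    # 3
```

In this model the hook never consults is_mirror, so each sync of the mirror produces another "completion" email. That is the same shape of behavior the report describes.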
2. Follow-up Actions and Changes Made
The response to the incident was multi-phased, involving immediate mitigation, a permanent fix, and long-term preventative measures.
A. Immediate Response (Mitigation)
Within minutes of the incident being reported, the following actions were taken:
- Feature Flag Disabled: The project_import_completion_email feature flag was disabled globally via ChatOps, immediately halting the sending of any new notification emails.
mkaeppler commented on 2024-12-03 11:57:00 UTC:
/chatops run feature set project_import_completion_email false
This should stop the emails.
- Problematic Code Reverted: The original Merge Request !173596 was reverted via gitlab-org/gitlab!174531. This revert was fast-tracked and deployed to production, completely removing the flawed code.
georgekoltsov commented on 2024-12-03 11:38:56 UTC (in MR !174531):
Revert MR causing import completion emails to be sent out to old projects.
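Disabling the flag could stop the emails before the revert reached production because the delivery path was gated on a runtime flag. The following Python sketch shows that pattern with an in-memory dictionary standing in for the flag store. Only the flag name comes from the report; the function name and data structures are assumptions for illustration.

```python
# In-memory stand-in for a feature-flag store (hypothetical; a real system persists flags).
feature_flags = {"project_import_completion_email": True}
outbox = []


def notify_import_completed(project_name: str) -> bool:
    """Queue the completion email only while the flag is enabled."""
    if not feature_flags.get("project_import_completion_email", False):
        return False  # flag off: skip without touching the mailer
    outbox.append(f"{project_name} | Import completed")
    return True


notify_import_completed("project-a")  # flag on: email queued

# In this model, the ChatOps command amounts to flipping the stored value.
feature_flags["project_import_completion_email"] = False

notify_import_completed("project-b")  # flag off: nothing queued
print(outbox)  # ['project-a | Import completed']
```

Because the check happens at call time, flipping the value takes effect on the next notification attempt and needs no deployment.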
B. Permanent Solution (The Fix)
A corrected version of the feature was developed and deployed in Merge Request gitlab-org/gitlab!174610. This new implementation introduced robust filtering to ensure notifications are only sent for their intended purpose.
Key Changes:
- Exclusion of Mirrors and Forks: The logic now explicitly checks if a project is a mirror or a fork and prevents notifications for them.
- Validation of Import Source: The code validates that the project.import_type is a legitimate, one-time import from a recognized source (e.g., github, gitea, bitbucket_server) and not an internal process like creating a project from a template.
The corrected logic in app/services/projects/import_completion_service.rb ensures all conditions are met before sending an email.
This change effectively resolved both the email spam issue and the credential exposure vulnerability.
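The two key changes listed above can be sketched as a single guard function. This Python sketch is not the code from !174610: the function name, the Project fields, and the exact contents of the allowlist are assumptions, and the three import types shown are only the examples this report names.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical allowlist; the report gives these three values as examples only.
NOTIFIABLE_IMPORT_TYPES = frozenset({"github", "gitea", "bitbucket_server"})


@dataclass(frozen=True)
class Project:
    import_type: Optional[str]
    is_mirror: bool = False
    is_fork: bool = False


def should_send_completion_email(project: Project) -> bool:
    """Allow the email only for one-time imports from a recognized source."""
    if project.is_mirror or project.is_fork:
        return False  # mirrors and forks reuse the import mechanism but are not imports
    return project.import_type in NOTIFIABLE_IMPORT_TYPES


print(should_send_completion_email(Project("github")))                  # True
print(should_send_completion_email(Project("github", is_mirror=True)))  # False
print(should_send_completion_email(Project("gitea", is_fork=True)))     # False
print(should_send_completion_email(Project(None)))                      # False
```

An allowlist fails closed: an import type that is missing or not listed produces no email, so a newly added internal use of the import mechanism cannot trigger notifications by accident.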
C. Long-term Improvements (Prevention)
The incident highlighted systemic issues that prompted further action to prevent similar problems in the future:
- Improved Documentation: Issue gitlab-org/gitlab#508458 was created to "Add documentation of how repository mirroring leverages aspects of project import." This aims to increase cross-team awareness of shared architectural components, making it less likely for engineers to introduce changes with unforeseen side effects on other features.
wortschi commented on 2024-12-09 09:01:15 UTC (in issue #508458):
Project mirroring is owned by ~"group::source code", project import/export is owned by ~"group::import and integrate". However, project mirroring leverages project import functionality. There's no developer documentation how the two features intersect which makes it difficult for engineers from the respective groups to be mindful about side effects when working on one of these two features.
- Architectural Refactoring: A follow-up issue, gitlab-org/gitlab#511960, was created to "Refactor built-in project creation by template to have its own import_type value." This aims to make the import_type field less ambiguous, allowing for clearer and safer logic when handling different types of project creation and import processes.
3. Conclusion
The production incident gitlab-com/gl-infra/production#18929 was caused by a new feature that did not account for the complex and overlapping ways GitLab's import functionality is used, particularly by the repository mirroring feature. This led to both a high volume of unwanted emails and a critical security vulnerability.
The incident was resolved swiftly through a combination of disabling the feature flag and reverting the code. The permanent fix, implemented in !174610, added the necessary filtering to correctly scope the feature. Key learnings from this event have led to important follow-up actions focused on improving documentation and refactoring ambiguous architectural components to enhance the stability and security of the platform.