Accessing exported instance data
Accessing data exported from your instance
For auditing or compliance purposes, you may want to see the exact data that has been shared from your Enterprise instance. All data shared from the instance ends up in an S3 bucket located in an AWS account owned by Gitpod. See the Data and Observability page for more information on the observability architecture.
Accessing the Data Shared
Customers can access the S3 bucket where the data is stored from any role or user in the Gitpod Instance’s AWS account by following these steps:
- Upon request, your Gitpod account manager can give you the name of the S3 bucket to which the data from your instance is sent.
- Set up the AWS CLI environment to assume any role or user in the AWS account where Gitpod is installed. For example, you can use the same user or role that was used to apply the CloudFormation template that installed Gitpod.
- Use the CLI to inspect the data.
The storage format depends on the telemetry type. For non-metrics data, the files can be directly inspected. For metrics data, see the instructions below.
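The three steps above can be sketched as a small Python wrapper around the AWS CLI. This is only an illustration for files that can be inspected directly, not for metrics data. It assumes the AWS CLI is installed and that a named profile maps to a role or user in the AWS account where Gitpod is installed. The bucket name, profile name, and helper function names are hypothetical placeholders.

```python
import subprocess

# Hypothetical placeholders: the real bucket name comes from your Gitpod
# account manager, and the profile is whichever AWS CLI profile maps to a
# role or user in the AWS account where Gitpod is installed.
BUCKET = "example-gitpod-telemetry-bucket"
PROFILE = "gitpod-instance-account"


def whoami_cmd(profile: str) -> list[str]:
    """Command that shows which AWS identity the CLI is currently using."""
    return ["aws", "sts", "get-caller-identity", "--profile", profile]


def list_cmd(bucket: str, profile: str, prefix: str = "") -> list[str]:
    """Command that lists every object under s3://bucket/prefix."""
    return ["aws", "s3", "ls", f"s3://{bucket}/{prefix}",
            "--recursive", "--profile", profile]


def download_cmd(bucket: str, key: str, dest: str, profile: str) -> list[str]:
    """Command that copies one object to a local path for inspection."""
    return ["aws", "s3", "cp", f"s3://{bucket}/{key}", dest,
            "--profile", profile]


def run(cmd: list[str]) -> str:
    """Run a command, raise on a non-zero exit code, and return its stdout."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def inspect_bucket(bucket: str, profile: str) -> None:
    """Confirm the active identity, then print the bucket listing."""
    print(run(whoami_cmd(profile)))
    print(run(list_cmd(bucket, profile)))
```

Calling `inspect_bucket(BUCKET, PROFILE)` prints the active identity followed by the object listing. A single file can then be fetched with `run(download_cmd(BUCKET, key, dest, PROFILE))`, where `key` is taken from that listing.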
Escalation Process for Data Leaks
If the S3 bucket is found to contain personally identifiable or confidential information that should not have been shared, the process for notifying Gitpod and remediating the issue is as follows:
- Customer can access data to identify potentially sensitive data leaks: Customers can inspect any data that was sent to Gitpod by gaining access to the S3 bucket to which all data from an instance is sent (see “Accessing the Data Shared” above).
- Gitpod also continuously monitors data from internal Enterprise instances for potential data leaks using a third-party sensitive data discovery tool (Amazon Macie). If any leaks are discovered that also apply to customer instances, the process below is followed as well. For more on the active data sanitisation mechanisms, see Observability and Data.
- Customer reports the data leak: Upon identifying a leak of confidential data, the customer can trigger a security incident via their Gitpod account manager.
- Data is deleted: The leaked data is identified, and measures are taken to delete it from S3 and then from any third-party systems.
  - For S3, there is the option to delete the entire bucket. In any case, the data in this bucket is configured to have a very short retention period. See Observability and Data.
  - If the effort is deemed worthwhile, the data can also be deleted individually.
  - For third-party services, the details depend on the service and the data that was leaked.
- Improvements made: The root cause of the leak is identified, and measures are put in place to prevent it from occurring again.
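For the first step of this process, a customer who has downloaded files from the bucket might run a quick scan for obviously sensitive strings before escalating. The sketch below is a minimal illustration and not part of Gitpod's tooling. It assumes the files are plain text in a local directory, and the email pattern and function names are hypothetical, deliberately simple choices.

```python
import re
from pathlib import Path

# Illustrative pattern only: it will miss some addresses and may flag
# strings that are not real email addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text: str) -> list[str]:
    """Return the distinct email-like strings found in text, sorted."""
    return sorted(set(EMAIL_RE.findall(text)))


def scan_directory(root: str) -> dict[str, list[str]]:
    """Map each file under root to the email-like strings it contains."""
    hits: dict[str, list[str]] = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Undecodable bytes are replaced rather than raising an error.
            found = find_emails(path.read_text(errors="replace"))
            if found:
                hits[str(path)] = found
    return hits
```

Any file reported by `scan_directory` would still need manual review to decide whether it warrants a security incident.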