When a developer installs a code analysis tool on their repository, the first questions they should ask are: what does this thing actually have access to, what does it do with my code, and what happens to the analysis results? This post answers those questions for DevPulsr, with as much technical specificity as we can provide.
The GitHub App installation model
DevPulsr connects to GitHub through the GitHub Apps platform. GitHub Apps are the modern, scoped alternative to personal access tokens — they're installed at the organization or repository level and request only the permissions they actually need, which are listed explicitly during installation.
The permissions DevPulsr requests on GitHub are:
- Repository contents: read. We need to read the source files referenced by a PR diff to build the call graph and API contract context for analysis. We read the default branch at install time (for the initial index) and the PR head branch at analysis time.
- Pull requests: read and write. Read: to access the diff, the PR metadata, and the list of changed files. Write: to post inline review comments and a summary check run result back to the PR.
- Check runs: write. To post a GitHub check status (pass/warn/fail) that appears in the PR's status checks section.
That's the complete permissions list. We do not request access to issues, wikis, deployments, secrets, organization members, or any other resource. During the installation flow, GitHub shows you each permission scope with its rationale before you confirm.
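One way to audit this yourself is to compare the permissions an installation reports against the list above. The sketch below is illustrative only: `EXPECTED_PERMISSIONS` and `excess_permissions` are hypothetical names, and the input is assumed to be a plain name-to-level mapping such as the `permissions` object GitHub returns for an App installation.

```python
# Expected permission set, transcribed from the list above.
# Key names (contents, pull_requests, checks) follow GitHub's permission naming.
EXPECTED_PERMISSIONS = {
    "contents": "read",
    "pull_requests": "write",  # write access includes read
    "checks": "write",
    # Assumption: GitHub usually also reports a baseline read-only "metadata"
    # permission for Apps, so it is treated as allowed here.
    "metadata": "read",
}

_LEVELS = {"read": 1, "write": 2, "admin": 3}


def excess_permissions(granted: dict[str, str]) -> dict[str, str]:
    """Return every granted permission that goes beyond the expected set."""
    excess = {}
    for name, level in granted.items():
        allowed = EXPECTED_PERMISSIONS.get(name)
        # Unknown level strings are treated as excessive rather than ignored.
        granted_rank = _LEVELS.get(level, max(_LEVELS.values()) + 1)
        if allowed is None or granted_rank > _LEVELS[allowed]:
            excess[name] = level
    return excess
```

An empty result means the installation holds nothing beyond the documented scopes.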
The GitLab integration model
GitLab's integration model uses a combination of a webhook and a project access token (or a GitLab OAuth application for organizational installs).
The webhook is configured at the project or group level and sends merge_request events to DevPulsr's ingestion endpoint when an MR is opened or updated. The access token provides read access to the repository contents and write access to MR notes (GitLab's term for comments, including inline ones).
For GitLab self-managed instances, the webhook URL and token are configured manually. For GitLab.com, the setup flow handles this automatically via OAuth. In both cases, the token scope requested is:
- read_repository
- write_merge_request_comments
No broader scope is requested or needed. The token does not have access to CI/CD configuration, container registry, project settings, or member management.
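On the receiving side, the merge_request webhook described above might be validated and filtered roughly as follows. This is a sketch, not DevPulsr's ingestion code: `should_analyze` and `HANDLED_ACTIONS` are hypothetical names, and it assumes GitLab's secret-token header (`X-Gitlab-Token`) and the `object_kind` and `object_attributes.action` payload fields.

```python
import hmac

# Assumption: only "open" and "update" trigger analysis, matching
# "opened or updated" in the text above.
HANDLED_ACTIONS = {"open", "update"}


def should_analyze(headers: dict[str, str], payload: dict, secret: str) -> bool:
    """Decide whether an incoming GitLab webhook delivery should be analyzed."""
    # Simplification: a real HTTP framework would do a case-insensitive lookup.
    token = headers.get("X-Gitlab-Token", "")
    # Constant-time comparison avoids leaking the secret through timing.
    if not hmac.compare_digest(token.encode(), secret.encode()):
        return False
    if payload.get("object_kind") != "merge_request":
        return False
    action = payload.get("object_attributes", {}).get("action")
    return action in HANDLED_ACTIONS
```

Rejecting deliveries before any repository access keeps the access token unused for events that will never be analyzed.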
What happens when a PR is opened
Here's the sequence of events when a PR is opened on a connected repository:
- Webhook received. GitHub sends a pull_request event (action: opened or synchronize) to DevPulsr's webhook endpoint. The payload contains PR metadata: number, head SHA, base SHA, changed files list, author.
- Diff fetch. DevPulsr fetches the raw diff for the PR using the GitHub API's compare endpoint. The diff is the same content you see when you click "Files changed" in the GitHub UI.
- Context fetch. For each changed file in the diff, DevPulsr fetches the full file content from the head branch. This is necessary to resolve function call sites and build the call graph context around the changed lines. We also fetch the relevant parts of the base branch for comparison.
- Analysis. The fetched content is passed to the analysis engine in an isolated compute environment. The analysis runs: AST parsing, call graph construction, semantic invariant extraction, test coverage mapping (using the test paths identified in the initial index), and API spec comparison.
- Results posted. Findings are posted as inline review comments on the PR, tagged to the specific lines where the issue exists. A summary check run result is posted with an overall severity indicator.
- Code discarded. The source content fetched in the diff fetch and context fetch steps is deleted from the analysis environment. We retain: the PR ID, the list of flagged findings (file path, line number, flag type, severity), and aggregate metadata for your dashboard. We do not retain raw source code.
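The fetch, analyze, post, discard sequence above can be sketched as a single function whose cleanup runs even when analysis fails. Everything here is illustrative: `run_analysis`, `Finding`, and `RetainedRecord` are hypothetical names, the three callables stand in for the real fetch, analysis, and posting steps, and a temporary directory stands in for the isolated compute environment.

```python
import shutil
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass(frozen=True)
class Finding:
    path: str
    line: int
    flag_type: str
    severity: str


@dataclass(frozen=True)
class RetainedRecord:
    """What survives the job: the PR ID and findings, never source text."""
    pr_id: int
    findings: tuple[Finding, ...]


def run_analysis(
    pr_id: int,
    fetch_sources: Callable[[Path], None],
    analyze: Callable[[Path], list[Finding]],
    post_results: Callable[[list[Finding]], None],
) -> RetainedRecord:
    workdir = Path(tempfile.mkdtemp(prefix="analysis-"))
    try:
        fetch_sources(workdir)        # diff fetch and context fetch
        findings = analyze(workdir)   # analysis over the fetched content
        post_results(findings)        # inline comments and summary check
        return RetainedRecord(pr_id, tuple(findings))
    finally:
        # Source content is removed whether or not the steps above succeeded.
        shutil.rmtree(workdir, ignore_errors=True)
```

Putting the deletion in a `finally` block ties the discard step to the job's lifetime rather than to its success.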
The initial index: what it does and what we store
On first install, DevPulsr builds an initial index of your default branch. This is necessary to have a baseline call graph and API contract state to compare PRs against. Without this baseline, we can't tell whether a function's behavior has changed or whether an API response shape is different from what consumers expect.
The initial index extracts: the call graph (which functions call which other functions), the function-level abstract syntax tree (AST) signatures (parameter types, return types, exception signatures), and the API endpoint schema if OpenAPI spec files are present in the repo.
We store the structural representation of the codebase — the graph and the signatures — not the raw source code. Think of it as storing the index of a book rather than the book itself: you can tell what chapters exist and how they reference each other, but you can't reconstruct the text from the index alone.
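To make the "index, not the book" idea concrete, here is a minimal sketch of extracting a call graph and signatures from Python source with the standard `ast` module. It is not DevPulsr's indexer: `index_source` is a hypothetical name, only direct calls by bare name are recorded, and calls inside nested functions are also attributed to the enclosing function.

```python
import ast


def index_source(source: str) -> dict:
    """Build a structural index (signatures and call edges) from source text."""
    tree = ast.parse(source)
    index = {"signatures": {}, "calls": {}}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            params = [arg.arg for arg in node.args.args]
            returns = ast.unparse(node.returns) if node.returns else None
            # Record only calls of the form name(...); attribute and
            # dynamic calls are skipped in this simplified version.
            callees = sorted({
                call.func.id
                for call in ast.walk(node)
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
            })
            index["signatures"][node.name] = {"params": params, "returns": returns}
            index["calls"][node.name] = callees
    return index
```

The resulting dictionary holds function names, parameter names, return annotations, and call edges, but none of the function bodies, so the original text cannot be rebuilt from it.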
This stored representation is deleted when you uninstall DevPulsr, and is not used for any purpose other than providing context for analysis of your own PRs.
Analysis latency and reliability
For repositories up to roughly 500,000 lines of code (a large-but-common size for a monorepo with multiple services), the typical analysis latency is 15–45 seconds from webhook receipt to comments posted. This is fast enough that the analysis results are visible before most reviewers have opened the PR.
For larger repositories, or for PRs with a high number of changed files, analysis can take 2–4 minutes. In these cases, a pending check status is shown in GitHub/GitLab while analysis is running, so reviewers know it's in progress.
Reliability: DevPulsr uses a queued analysis architecture with idempotent processing, so if a webhook delivery is retried (both GitHub and GitLab retry failed deliveries), the analysis won't post duplicate comments. If analysis fails for any reason (network error, parsing error on unusual code patterns), the PR check status is set to neutral rather than blocked — we never cause a PR to fail to merge due to an analysis error.
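A minimal sketch of those two guarantees, assuming deliveries are deduplicated on the (repository, PR number, head SHA) triple; that key choice is an assumption, not something stated above. `AnalysisDispatcher` is a hypothetical name, and the in-memory dictionary stands in for whatever durable store a real queue would use.

```python
from typing import Callable


class AnalysisDispatcher:
    """Runs each (repo, PR, head SHA) at most once; errors become 'neutral'."""

    def __init__(self) -> None:
        # Stand-in for a durable store shared by queue workers.
        self._results: dict[tuple[str, int, str], str] = {}

    def handle(
        self,
        repo: str,
        pr_number: int,
        head_sha: str,
        analyze: Callable[[], str],
    ) -> str:
        key = (repo, pr_number, head_sha)
        if key in self._results:
            # Retried delivery: reuse the earlier outcome, post nothing new.
            return self._results[key]
        try:
            conclusion = analyze()  # expected to return "pass", "warn", or "fail"
        except Exception:
            # An analysis error must never block the merge.
            conclusion = "neutral"
        self._results[key] = conclusion
        return conclusion
```

Because the head SHA is part of the key, a new push to the PR is analyzed again, while a redelivery of the same event is not.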
What we never do
A few explicit commitments:
- We do not store your raw source code beyond the duration of the analysis job.
- We do not use your code or analysis results to train or fine-tune any model.
- We do not share analysis data across customer accounts — your codebase's structural representation is isolated in your organization's data partition.
- We do not request permissions beyond what's listed above. If a future feature required additional permissions, that would require explicit re-authorization through the GitHub Apps permission update flow.
If you have questions about specific security or compliance requirements (SOC 2, data residency, enterprise proxy configurations), the right path is a conversation with us directly. Reach out by email or through the contact page.