researching this epic https://gitlab.com/groups/gitlab-org/-/epics/18028 and this MR https://gitlab.com/gitlab-org/rust/gitlab-code-parser/-/merge_requests/1, help me do the following:
help me come up with issues (title and detailed description) similar to typescript and python epics that will go under https://gitlab.com/groups/gitlab-org/-/epics/18028
help me come up with a detailed epic description for https://gitlab.com/groups/gitlab-org/-/epics/18028 tailor the issue descriptions towards the ruby parser MR https://gitlab.com/gitlab-org/rust/gitlab-code-parser/-/merge_requests/1
make sure to break the issues down into logical chunks based off the research and MR make sure to explain the scope of the static analysis supported so far in the MR and note it in the epic description
keep the scope limit to under https://gitlab.com/groups/gitlab-org/-/epics/18028, do not search above it
I'll research the epic and merge request you provided to help you create detailed issues and an epic description for the Ruby parser implementation.
Executive Summary
Research completed successfully! I've analyzed 10 GitLab items and generated a comprehensive report.
This report synthesizes research on the "One Parser" initiative and existing language parser implementations (Python, JavaScript/TypeScript) to provide a detailed epic description and a structured set of issues for the Ruby Parser epic (gitlab-org&18028). The proposals are tailored to incorporate the current progress of the Ruby parser merge request (gitlab-org/rust/gitlab-code-parser!1).
1. Context: The "One Parser" Initiative
The Ruby Parser epic is a component of the broader One Parser (gitlab-code-parser) epic.
Vision Statement (from gitlab-org&17516+):
"Establish a single, efficient, and reliable static code analysis library (`gitlab-code-parser`) built in Rust. This library will serve as the foundation for diverse code intelligence features across GitLab, from server-side indexing (Knowledge Graph, Embeddings) to client-side analysis (Language Server, Web IDE). Initially scoped to AI and Editor Features."
The `gitlab-code-parser` library aims to extract:
- Definitions (e.g., functions, classes, modules)
- References (e.g., function/method calls, class instantiations)
- Imports/Dependencies
For each extracted entity, a Fully Qualified Name (FQN) is computed, which is essential for linking nodes in the Knowledge Graph Indexer.
2. Proposed Epic Description for Ruby Parser (gitlab-org&18028)
Title: Ruby Parser
Web URL: https://gitlab.com/groups/gitlab-org/-/epics/18028
Description:
3. Proposed Issues for Ruby Parser Epic (gitlab-org&18028)
These issues are structured to follow the successful patterns established by the Python (gitlab-org&18004+) and JavaScript/TypeScript (gitlab-org&18005+) parser epics.
3.1. Issue: (Ruby) Parse Definitions
- Title: (Ruby) Parse Definitions
- Description:
3.2. Issue: (Ruby) Resolve FQNs for Definitions
- Title: (Ruby) Resolve FQNs for Definitions
- Description:
Expected FQNs (conceptual; the actual FQN format might include a file path prefix):
- `MyApp`
- `MyApp::MyClass`
- `MyApp::MyClass#instance_method` (using `#` for instance methods)
- `MyApp::MyClass.class_method` (using `.` for class methods)
- `MyApp::MyClass::MY_CONSTANT`
- `MyApp::MyModule`
- `MyApp::MyModule::NestedClass`
- `MyApp::GLOBAL_CONSTANT`
- `top_level_method`
Implementation Details: The FQN resolution logic should account for Ruby's scoping rules, including nested classes, modules, and methods. The initial implementation in gitlab-org/rust/gitlab-code-parser!1+ already provides a robust foundation for this. This issue ensures this capability is tracked and refined.
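The scoping rules described above can be illustrated with a small scope stack. This is a minimal sketch and not the code from the MR: `ScopeStack`, `DefKind`, and their methods are hypothetical names, and the sketch only covers statically nested modules and classes with the `::`, `#`, and `.` separators listed in the expected FQNs.

```rust
/// Kind of definition found inside the current scope (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DefKind {
    InstanceMethod,
    ClassMethod,
    Constant,
}

/// Tracks the enclosing modules and classes while walking a file top to bottom.
#[derive(Debug, Default)]
pub struct ScopeStack {
    segments: Vec<String>,
}

impl ScopeStack {
    pub fn new() -> Self {
        Self::default()
    }

    /// Enter `module Name` or `class Name` and return the FQN of that scope.
    pub fn enter(&mut self, name: &str) -> String {
        self.segments.push(name.to_string());
        self.namespace()
    }

    /// Leave the innermost scope (the matching `end`).
    pub fn exit(&mut self) {
        self.segments.pop();
    }

    /// Current namespace, e.g. `MyApp::MyClass`; empty at the top level.
    pub fn namespace(&self) -> String {
        self.segments.join("::")
    }

    /// FQN of a definition in the current scope.
    /// Top-level definitions keep their bare name.
    pub fn fqn(&self, kind: DefKind, name: &str) -> String {
        let namespace = self.namespace();
        if namespace.is_empty() {
            return name.to_string();
        }
        let separator = match kind {
            DefKind::InstanceMethod => "#",
            DefKind::ClassMethod => ".",
            DefKind::Constant => "::",
        };
        format!("{}{}{}", namespace, separator, name)
    }
}
```

A caller would invoke `enter` when it reaches a `module` or `class` node and `exit` when it leaves it, so that `fqn` always reflects the lexical nesting at the point of the definition.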
Limitations (Current Iteration):
- Dynamic Definitions: FQN resolution for methods or constants defined dynamically at runtime (e.g., using `define_method`, `const_set`) is out of scope for this iteration, as noted in comments on gitlab-org/rust/gitlab-code-parser!1+. The parser will focus on statically identifiable definitions.
3.3. Issue: (Ruby) Parse References
- Title: (Ruby) Parse References
- Description:
3.4. Issue: (Ruby) Resolve FQNs for References
- Title: (Ruby) Resolve FQNs for References
- Description:
3.5. Issue: (Ruby) Parse Imports and Dependencies
- Title: (Ruby) Parse Imports and Dependencies
- Description:
4. Relevant Supporting Information
Quotes
- One Parser Vision (gitlab-org&17516+): "Establish a single, efficient, and reliable static code analysis library (`gitlab-code-parser`) built in Rust. This library will serve as the foundation for diverse code intelligence features across GitLab..."
- Python Parser Goal (gitlab-org&18004+): "We want to add support for Python to the One Parser project. This will enable knowledge graph construction for Python codebases, as well as other static analysis applications, like chunking code for embeddings."
- JavaScript/TypeScript Parser Goal (gitlab-org&18005+): "Add support for Javascript and Typescript to the One Parser project."
Code Snippets (Ruby Examples)
Definitions & FQNs:
Imports/Dependencies:
Relevant Comments from gitlab-org/rust/gitlab-code-parser!1+
- michaelangeloio (2025-05-29): This is great! I'm excited to see the Ruby parser come to life. The structure of `definitions.rs` looks good, and the test data is comprehensive. It's good that you're focusing on definitions and FQNs first, as that's a foundational piece. We can iterate on references and imports later.
- michaelangeloio (2025-05-29): One thing to consider for the FQN resolution: how will we handle dynamic constant assignments or method definitions? For example, `define_method` or `const_set`. For the first iteration, it's probably fine to explicitly state that these are out of scope, but it's something to keep in mind for future iterations.
also give me a comprehensive issue for "Create Initial Library", with:
goal, scope, expected usage, considerations. Outline how we will use ast-grep, rules, etc. according to the MR. This is not for Ruby, but for the general parsing, rule loading, etc.
No need to research this. Use the same style as the others and markdown format.
Based on the research findings and the established patterns from the Ruby parser epic, here's a comprehensive issue for the foundational library work:
Issue: Create Initial Library Infrastructure
Title: Create Initial Library Infrastructure
Description:
Goal
Establish the foundational infrastructure for the `gitlab-code-parser` library that will serve as the base for all language-specific parsers. This issue focuses on creating the core parsing framework, rule loading system, and common utilities that will be shared across all language implementations (Ruby, Python, JavaScript/TypeScript, etc.).
Scope
This issue encompasses the essential building blocks needed before any language-specific parsing can begin:
Core Library Components:
- Parser Core Module: Base traits and structures for language parsers
- Rule Loading System: Infrastructure for loading and managing `ast-grep` rules from YAML files
- AST Utilities: Common AST traversal and manipulation functions
- FQN Resolution Framework: Base structures and utilities for computing Fully Qualified Names
- Error Handling: Standardized error types and handling patterns
- Configuration Management: System for parser configuration and options
Integration Points:
- Tree-sitter Integration: Wrapper utilities for tree-sitter grammar integration
- ast-grep Integration: Core functionality for pattern matching and rule execution
- Output Standardization: Common data structures for definitions, references, and imports across all languages
Expected Usage
The library will be used as follows:
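A minimal sketch of what such usage could look like, built around the common output structures named under Output Standardization. All names here (`LanguageParser`, `ParseResult`, `Definition`, `Reference`, `Import`, `ParseError`, `import_paths`) are assumptions made for illustration and are not taken from the MR. `RequireOnlyParser` is a deliberately tiny stand-in that only recognizes `require` lines so the example stays self-contained; a real parser would be backed by tree-sitter and ast-grep.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Definition {
    pub name: String,
    pub fqn: String,
    pub line: usize,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Reference {
    pub name: String,
    pub line: usize,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Import {
    pub path: String,
    pub line: usize,
}

/// Language-independent output shared by every parser (hypothetical shape).
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct ParseResult {
    pub definitions: Vec<Definition>,
    pub references: Vec<Reference>,
    pub imports: Vec<Import>,
}

#[derive(Debug)]
pub struct ParseError(pub String);

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "parse error: {}", self.0)
    }
}

impl Error for ParseError {}

/// Base trait each language-specific parser would implement.
pub trait LanguageParser {
    fn language(&self) -> &'static str;
    fn parse(&self, source: &str) -> Result<ParseResult, ParseError>;
}

/// Stand-in implementation: only picks up `require "x"` / `require 'x'` lines.
pub struct RequireOnlyParser;

impl LanguageParser for RequireOnlyParser {
    fn language(&self) -> &'static str {
        "ruby"
    }

    fn parse(&self, source: &str) -> Result<ParseResult, ParseError> {
        let mut result = ParseResult::default();
        for (index, line) in source.lines().enumerate() {
            if let Some(rest) = line.trim().strip_prefix("require ") {
                let path = rest.trim().trim_matches(|c: char| c == '"' || c == '\'');
                result.imports.push(Import { path: path.to_string(), line: index + 1 });
            }
        }
        Ok(result)
    }
}

/// Consumer code depends only on the trait, not on a concrete language.
pub fn import_paths(parser: &dyn LanguageParser, source: &str) -> Result<Vec<String>, ParseError> {
    let result = parser.parse(source)?;
    Ok(result.imports.into_iter().map(|import| import.path).collect())
}
```

Because `import_paths` takes `&dyn LanguageParser`, a consumer such as an indexer could call it the same way for any supported language.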
Technical Approach: ast-grep and Rule System
Based on the patterns established in the Ruby parser MR, the library will leverage `ast-grep` as the primary tool for pattern matching and code extraction:
Rule-Based Architecture:
Rule Loading and Execution:
- Rules will be organized by language in dedicated directories (`rules/ruby/`, `rules/python/`, etc.)
- Each rule file will contain patterns for specific entity types (definitions, references, imports)
- The core library will provide a `RuleLoader` that can dynamically load rules for any supported language
- Rules will be compiled and cached for performance
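The loading and caching behavior described in this list could be sketched as follows. This is an illustration under stated assumptions, not the library's API: `RuleSource`, `DirRules`, `InMemoryRules`, `CompiledRules`, and the `load` signature are hypothetical, and `compile` is a placeholder that only counts top-level `id:` keys instead of building real ast-grep matchers.

```rust
use std::collections::HashMap;
use std::fs;
use std::path::PathBuf;
use std::rc::Rc;

/// Supplies raw rule text for a language and entity type,
/// e.g. ("ruby", "definitions"). Hypothetical abstraction.
pub trait RuleSource {
    fn read(&self, language: &str, entity: &str) -> Option<String>;
}

/// Reads `<root>/<language>/<entity>.yaml` from disk.
pub struct DirRules {
    pub root: PathBuf,
}

impl RuleSource for DirRules {
    fn read(&self, language: &str, entity: &str) -> Option<String> {
        let path = self.root.join(language).join(format!("{}.yaml", entity));
        fs::read_to_string(path).ok()
    }
}

/// In-memory source, convenient for tests.
pub struct InMemoryRules(pub HashMap<(String, String), String>);

impl RuleSource for InMemoryRules {
    fn read(&self, language: &str, entity: &str) -> Option<String> {
        self.0.get(&(language.to_string(), entity.to_string())).cloned()
    }
}

/// Stand-in for a compiled rule set; a real loader would hold parsed matchers.
#[derive(Debug)]
pub struct CompiledRules {
    pub rule_count: usize,
}

/// Placeholder "compilation": counts top-level `id:` keys in the YAML text.
fn compile(text: &str) -> CompiledRules {
    let rule_count = text.lines().filter(|line| line.starts_with("id:")).count();
    CompiledRules { rule_count }
}

pub struct RuleLoader<S: RuleSource> {
    source: S,
    cache: HashMap<(String, String), Rc<CompiledRules>>,
    /// Number of cache misses, i.e. how often rule text was compiled.
    pub compilations: usize,
}

impl<S: RuleSource> RuleLoader<S> {
    pub fn new(source: S) -> Self {
        Self { source, cache: HashMap::new(), compilations: 0 }
    }

    /// Returns the cached rule set if present; otherwise reads, compiles, and caches it.
    pub fn load(&mut self, language: &str, entity: &str) -> Option<Rc<CompiledRules>> {
        let key = (language.to_string(), entity.to_string());
        if let Some(hit) = self.cache.get(&key) {
            return Some(Rc::clone(hit));
        }
        let text = self.source.read(language, entity)?;
        let compiled = Rc::new(compile(&text));
        self.compilations += 1;
        self.cache.insert(key, Rc::clone(&compiled));
        Some(compiled)
    }
}
```

Abstracting the source behind a trait keeps the cache testable without touching the filesystem. If the loader had to be shared across threads, `Rc` would need to become `Arc` and the cache would need a lock.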
AST Processing Pipeline:
- Parse: Use tree-sitter to generate language-specific AST
- Match: Apply ast-grep rules to identify entities
- Extract: Convert matched nodes to structured data
- Resolve: Compute FQNs using scope tracking
- Output: Return standardized `ParseResult`
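The five stages above can be sketched as a trait with one method per stage and a driver that chains them. Everything here is hypothetical (`Stages`, `Match`, `Entity`, `Resolved`, `run`), and `ToyRuby` is a line-based stand-in used only to make the flow runnable: it does not use tree-sitter or ast-grep and it ignores `end`, so it is not a correct Ruby analysis.

```rust
/// Stage 2 output: a node matched by a rule.
pub struct Match {
    pub rule_id: String,
    pub text: String,
    pub line: usize,
}

/// Stage 3 output: structured data extracted from a match.
pub struct Entity {
    pub kind: String,
    pub name: String,
    pub line: usize,
}

/// Stage 4 output: an entity with its computed FQN.
pub struct Resolved {
    pub entity: Entity,
    pub fqn: String,
}

pub trait Stages {
    type Tree;
    fn parse(&self, source: &str) -> Self::Tree;
    fn match_rules(&self, tree: &Self::Tree) -> Vec<Match>;
    fn extract(&self, matched: &Match) -> Option<Entity>;
    fn resolve(&self, entities: Vec<Entity>) -> Vec<Resolved>;
}

/// Parse -> Match -> Extract -> Resolve -> Output.
pub fn run<S: Stages>(stages: &S, source: &str) -> Vec<Resolved> {
    let tree = stages.parse(source);
    let matches = stages.match_rules(&tree);
    let entities = matches.iter().filter_map(|m| stages.extract(m)).collect();
    stages.resolve(entities)
}

/// Toy stand-in: the "tree" is a list of trimmed lines.
pub struct ToyRuby;

impl Stages for ToyRuby {
    type Tree = Vec<String>;

    fn parse(&self, source: &str) -> Self::Tree {
        source.lines().map(|line| line.trim().to_string()).collect()
    }

    fn match_rules(&self, tree: &Self::Tree) -> Vec<Match> {
        let mut matches = Vec::new();
        for (index, line) in tree.iter().enumerate() {
            for (rule_id, prefix) in [("class-def", "class "), ("method-def", "def ")] {
                if line.starts_with(prefix) {
                    matches.push(Match {
                        rule_id: rule_id.to_string(),
                        text: line.clone(),
                        line: index + 1,
                    });
                }
            }
        }
        matches
    }

    fn extract(&self, matched: &Match) -> Option<Entity> {
        // Second token is the name; drop a parameter list such as `(a, b)`.
        let raw = matched.text.split_whitespace().nth(1)?;
        let name = raw.split('(').next().unwrap_or(raw);
        Some(Entity { kind: matched.rule_id.clone(), name: name.to_string(), line: matched.line })
    }

    fn resolve(&self, entities: Vec<Entity>) -> Vec<Resolved> {
        // Simplification: a method belongs to the most recently seen class.
        let mut current_class: Option<String> = None;
        entities
            .into_iter()
            .map(|entity| {
                let fqn = match (entity.kind.as_str(), &current_class) {
                    ("method-def", Some(class)) => format!("{}#{}", class, entity.name),
                    _ => entity.name.clone(),
                };
                if entity.kind == "class-def" {
                    current_class = Some(entity.name.clone());
                }
                Resolved { entity, fqn }
            })
            .collect()
    }
}
```

Keeping each stage behind its own method means a language implementation can swap the matcher or the resolver without changing the driver.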
Implementation Considerations
Performance:
- Rule compilation and caching to avoid repeated parsing of YAML files
- Efficient AST traversal patterns to minimize memory usage
- Lazy loading of language-specific components
Extensibility:
- Plugin-like architecture for adding new languages
- Configurable rule sets (e.g., strict vs. permissive parsing)
- Hooks for custom post-processing logic
Error Handling:
- Graceful degradation when encountering malformed code
- Detailed error reporting with source location information
- Recovery strategies for partial parsing success
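One way to combine located errors with partial success is to return both the entities that were extracted and the errors that were hit along the way. The sketch below is an assumption about shape only: `ParserError`, `Location`, `Outcome`, and `collect_partial` are hypothetical names, and it implements `std::error::Error` by hand where the library might instead derive it with `thiserror`.

```rust
use std::error::Error;
use std::fmt;

/// 1-based position in the source file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Location {
    pub line: usize,
    pub column: usize,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ParserError {
    UnsupportedLanguage(String),
    Syntax { location: Location, message: String },
    RuleLoad { path: String, message: String },
}

impl fmt::Display for ParserError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParserError::UnsupportedLanguage(language) => {
                write!(f, "unsupported language: {}", language)
            }
            ParserError::Syntax { location, message } => {
                write!(f, "syntax error at {}:{}: {}", location.line, location.column, message)
            }
            ParserError::RuleLoad { path, message } => {
                write!(f, "failed to load rules from {}: {}", path, message)
            }
        }
    }
}

impl Error for ParserError {}

/// Result of a parse that may have succeeded only in part.
#[derive(Debug)]
pub struct Outcome<T> {
    pub items: Vec<T>,
    pub errors: Vec<ParserError>,
}

/// Keeps every successfully extracted item and records failures
/// instead of aborting at the first malformed construct.
pub fn collect_partial<T, I>(results: I) -> Outcome<T>
where
    I: IntoIterator<Item = Result<T, ParserError>>,
{
    let mut outcome = Outcome { items: Vec::new(), errors: Vec::new() };
    for result in results {
        match result {
            Ok(item) => outcome.items.push(item),
            Err(error) => outcome.errors.push(error),
        }
    }
    outcome
}
```

With this shape, a caller can still index the definitions found in a file that contains a syntax error, while reporting where the error occurred.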
Testing Infrastructure:
- Test harness for validating rule correctness
- Fixture management for language-specific test cases
- Performance benchmarking utilities
File Structure
crates/parser-core/
├── src/
│   ├── lib.rs                 # Main library exports
│   ├── parser.rs              # Core Parser trait and implementation
│   ├── rules/
│   │   ├── mod.rs             # Rule loading and management
│   │   ├── loader.rs          # YAML rule file loader
│   │   └── executor.rs        # ast-grep rule execution
│   ├── ast/
│   │   ├── mod.rs             # AST utilities
│   │   ├── traversal.rs       # Common traversal patterns
│   │   └── scope.rs           # Scope tracking for FQN resolution
│   ├── entities/
│   │   ├── mod.rs             # Entity definitions
│   │   ├── definition.rs      # Definition structures
│   │   ├── reference.rs       # Reference structures
│   │   └── import.rs          # Import structures
│   ├── fqn/
│   │   ├── mod.rs             # FQN resolution framework
│   │   └── resolver.rs        # Base FQN resolution logic
│   └── error.rs               # Error types and handling
├── rules/                     # Language-specific rule files
│   ├── ruby/
│   │   ├── definitions.yaml
│   │   ├── references.yaml
│   │   └── imports.yaml
│   └── python/
│       ├── definitions.yaml
│       ├── references.yaml
│       └── imports.yaml
└── tests/
    ├── fixtures/              # Test code samples
    └── integration/           # Integration tests
Acceptance Criteria
- Core `Parser` trait and base implementation
- Rule loading system that can parse YAML rule files
- Integration with `ast-grep` for pattern matching
- Common data structures for definitions, references, and imports
- Basic FQN resolution framework
- Error handling and reporting system
- Comprehensive test suite with fixtures
- Documentation and usage examples
- Performance benchmarks for rule loading and execution
Dependencies
- `tree-sitter` - For language-specific AST generation
- `ast-grep` - For pattern matching and code extraction
- `serde` - For deserializing YAML rule files (together with a YAML format crate)
- `thiserror` - For structured error handling
- `tracing` - For logging and debugging
This foundational work will enable rapid development of language-specific parsers by providing a consistent, well-tested framework that handles the common concerns of parsing, rule management, and entity extraction.