researching this epic https://gitlab.com/groups/gitlab-org/-/epics/18028 and this MR https://gitlab.com/gitlab-org/rust/gitlab-code-parser/-/merge_requests/1, help me do the following:

help me come up with issues (title and detailed description) similar to typescript and python epics that will go under https://gitlab.com/groups/gitlab-org/-/epics/18028

help me come up with a detailed epic description for https://gitlab.com/groups/gitlab-org/-/epics/18028 tailor the issue descriptions towards the ruby parser MR https://gitlab.com/gitlab-org/rust/gitlab-code-parser/-/merge_requests/1

make sure to break the issues down into logical chunks based on the research and the MR. make sure to explain the scope of the static analysis supported so far in the MR and note it in the epic description

keep the scope limited to https://gitlab.com/groups/gitlab-org/-/epics/18028 and anything under it, do not search above it

Assistant

I'll research the epic and merge request you provided to help you create detailed issues and an epic description for the Ruby parser implementation.

Executive Summary

Research completed successfully! I've analyzed 10 GitLab items and generated a comprehensive report.

This report synthesizes research on the "One Parser" initiative and existing language parser implementations (Python, JavaScript/TypeScript) to provide a detailed epic description and a structured set of issues for the Ruby Parser epic (gitlab-org&18028). The proposals are tailored to incorporate the current progress of the Ruby parser Merge Request (gitlab-org/rust/gitlab-code-parser!1).

1. Context: The "One Parser" Initiative

The Ruby Parser epic is a component of the broader One Parser (gitlab-code-parser) epic.

Vision Statement (from gitlab-org&17516+):

"Establish a single, efficient, and reliable static code analysis library (gitlab-code-parser) built in Rust. This library will serve as the foundation for diverse code intelligence features across GitLab, from server-side indexing (Knowledge Graph, Embeddings) to client-side analysis (Language Server, Web IDE). Initially scoped to AI and Editor Features."

The gitlab-code-parser aims to extract:

  1. Definitions (e.g., functions, classes, modules)
  2. References (e.g., function/method calls, class instantiations)
  3. Imports/Dependencies

For each extracted entity, a Fully Qualified Name (FQN) is computed, which is essential for linking nodes in the Knowledge Graph Indexer.

2. Proposed Epic Description for Ruby Parser (gitlab-org&18028)

Title: Ruby Parser

Web URL: https://gitlab.com/groups/gitlab-org/-/epics/18028

Description:

Loading syntax highlighting...

3. Proposed Issues for Ruby Parser Epic (gitlab-org&18028)

These issues are structured to follow the successful patterns established by the Python (gitlab-org&18004+) and JavaScript/TypeScript (gitlab-org&18005+) parser epics.


3.1. Issue: (Ruby) Parse Definitions

  • Title: (Ruby) Parse Definitions
  • Description:
    Loading syntax highlighting...

3.2. Issue: (Ruby) Resolve FQNs for Definitions

  • Title: (Ruby) Resolve FQNs for Definitions

  • Description:

    Loading syntax highlighting...

    Expected FQNs (conceptual; the actual FQN format might include a file path prefix):

    • MyApp
    • MyApp::MyClass
    • MyApp::MyClass#instance_method (using # for instance methods)
    • MyApp::MyClass.class_method (using . for class methods)
    • MyApp::MyClass::MY_CONSTANT
    • MyApp::MyModule
    • MyApp::MyModule::NestedClass
    • MyApp::GLOBAL_CONSTANT
    • top_level_method

    Implementation Details: The FQN resolution logic should account for Ruby's scoping rules, including nested classes, modules, and methods. The initial implementation in gitlab-org/rust/gitlab-code-parser!1+ already provides a robust foundation for this. This issue ensures this capability is tracked and refined.
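The separator conventions in the list above can be sketched in Rust as follows. This is a minimal illustration only, not the logic in the MR: `DefKind` and `fqn` are hypothetical names, and the scope stack is assumed to be supplied by whatever walks the syntax tree.

```rust
/// How a definition attaches to its enclosing scope (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DefKind {
    /// Module, class, or constant: joined with `::`.
    Namespace,
    /// Instance method: joined with `#`.
    InstanceMethod,
    /// Class (singleton) method: joined with `.`.
    ClassMethod,
}

/// Builds an FQN from the stack of enclosing module/class names.
pub fn fqn(scope: &[&str], name: &str, kind: DefKind) -> String {
    if scope.is_empty() {
        // Top-level definitions carry no prefix (e.g. `top_level_method`).
        return name.to_string();
    }
    let separator = match kind {
        DefKind::Namespace => "::",
        DefKind::InstanceMethod => "#",
        DefKind::ClassMethod => ".",
    };
    format!("{}{}{}", scope.join("::"), separator, name)
}
```

For example, `fqn(&["MyApp", "MyClass"], "instance_method", DefKind::InstanceMethod)` yields `MyApp::MyClass#instance_method`, matching the expected list.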

    Limitations (Current Iteration):

    • Dynamic Definitions: FQN resolution for methods or constants defined dynamically at runtime (e.g., using define_method, const_set) is out of scope for this iteration, as noted in comments on gitlab-org/rust/gitlab-code-parser!1+. The parser will focus on statically identifiable definitions.

3.3. Issue: (Ruby) Parse References

  • Title: (Ruby) Parse References
  • Description:
    Loading syntax highlighting...

3.4. Issue: (Ruby) Resolve FQNs for References

  • Title: (Ruby) Resolve FQNs for References
  • Description:
    Loading syntax highlighting...

3.5. Issue: (Ruby) Parse Imports and Dependencies

  • Title: (Ruby) Parse Imports and Dependencies
  • Description:
    Loading syntax highlighting...

4. Relevant Supporting Information

Quotes

  • One Parser Vision (gitlab-org&17516+):

    "Establish a single, efficient, and reliable static code analysis library (gitlab-code-parser) built in Rust. This library will serve as the foundation for diverse code intelligence features across GitLab..."

  • Python Parser Goal (gitlab-org&18004+):

    "We want to add support for Python to the One Parser project. This will enable knowledge graph construction for Python codebases, as well as other static analysis applications, like chunking code for embeddings."

  • JavaScript/TypeScript Parser Goal (gitlab-org&18005+):

    "Add support for Javascript and Typescript to the One Parser project."

Code Snippets (Ruby Examples)

Definitions & FQNs:

Loading syntax highlighting...

Imports/Dependencies:

Loading syntax highlighting...

Relevant Comments from gitlab-org/rust/gitlab-code-parser!1+

  • michaelangeloio (2025-05-29):

    This is great! I'm excited to see the Ruby parser come to life. The structure of definitions.rs looks good, and the test data is comprehensive. It's good that you're focusing on definitions and FQNs first, as that's a foundational piece. We can iterate on references and imports later.

  • michaelangeloio (2025-05-29):

    One thing to consider for the FQN resolution: how will we handle dynamic constant assignments or method definitions? For example, define_method or const_set. For the first iteration, it's probably fine to explicitly state that these are out of scope, but it's something to keep in mind for future iterations.

also give me a comprehensive issue for "Create Initial Library", with:

  • goal
  • scope
  • expected usage
  • considerations

Outline how we will use ast-grep, rules, etc. according to the MR. This is not for Ruby, but for the general parsing and rule loading, etc.

No need to research this. Use the same style as the others and markdown format.

Assistant

Based on the research findings and the established patterns from the Ruby parser epic, here's a comprehensive issue for the foundational library work:

Issue: Create Initial Library Infrastructure

Title: Create Initial Library Infrastructure

Description:

Goal

Establish the foundational infrastructure for the gitlab-code-parser library that will serve as the base for all language-specific parsers. This issue focuses on creating the core parsing framework, rule loading system, and common utilities that will be shared across all language implementations (Ruby, Python, JavaScript/TypeScript, etc.).

Scope

This issue encompasses the essential building blocks needed before any language-specific parsing can begin:

Core Library Components:

  • Parser Core Module: Base traits and structures for language parsers
  • Rule Loading System: Infrastructure for loading and managing ast-grep rules from YAML files
  • AST Utilities: Common AST traversal and manipulation functions
  • FQN Resolution Framework: Base structures and utilities for computing Fully Qualified Names
  • Error Handling: Standardized error types and handling patterns
  • Configuration Management: System for parser configuration and options

Integration Points:

  • Tree-sitter Integration: Wrapper utilities for tree-sitter grammar integration
  • ast-grep Integration: Core functionality for pattern matching and rule execution
  • Output Standardization: Common data structures for definitions, references, and imports across all languages
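As a rough sketch of what the common output structures could look like, the snippet below models definitions, references, and imports with one shared entity type. Apart from `ParseResult`, which this issue names elsewhere, every type and field here is an assumption for illustration rather than the library's actual API.

```rust
/// Which of the three extracted categories an entity belongs to.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EntityKind {
    Definition,
    Reference,
    Import,
}

/// Line range of the entity in its source file (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Span {
    pub start_line: usize,
    pub end_line: usize,
}

/// One extracted entity, shared across all languages.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entity {
    pub kind: EntityKind,
    pub language: &'static str,
    pub name: String,
    /// `None` when no FQN could be resolved statically.
    pub fqn: Option<String>,
    pub span: Span,
}

/// Language-independent result returned to callers.
#[derive(Debug, Default)]
pub struct ParseResult {
    pub definitions: Vec<Entity>,
    pub references: Vec<Entity>,
    pub imports: Vec<Entity>,
}

impl ParseResult {
    /// Files an entity under the list matching its kind.
    pub fn push(&mut self, entity: Entity) {
        match entity.kind {
            EntityKind::Definition => self.definitions.push(entity),
            EntityKind::Reference => self.references.push(entity),
            EntityKind::Import => self.imports.push(entity),
        }
    }
}
```

Keeping a single `Entity` type means downstream consumers such as the Knowledge Graph Indexer would only need one code path per category, regardless of the source language.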

Expected Usage

The library will be used as follows:

Loading syntax highlighting...

Technical Approach: ast-grep and Rule System

Based on the patterns established in the Ruby parser MR, the library will leverage ast-grep as the primary tool for pattern matching and code extraction:

Rule-Based Architecture:

Loading syntax highlighting...

Rule Loading and Execution:

  • Rules will be organized by language in dedicated directories (rules/ruby/, rules/python/, etc.)
  • Each rule file will contain patterns for specific entity types (definitions, references, imports)
  • The core library will provide a RuleLoader that can dynamically load rules for any supported language
  • Rules will be compiled and cached for performance
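A minimal sketch of the per-language loading and caching described above, using only the standard library. It assumes rules live under `<root>/<language>/*.yaml` and caches the raw YAML text; compiling the text into ast-grep rules is deliberately left out, and all method names on `RuleLoader` are hypothetical.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Raw rule files for one language, keyed by file stem (e.g. "definitions").
pub type RuleSet = HashMap<String, String>;

/// Loads rule files on first use and serves later requests from memory.
pub struct RuleLoader {
    root: PathBuf,
    cache: HashMap<String, RuleSet>,
}

impl RuleLoader {
    pub fn new(root: impl Into<PathBuf>) -> Self {
        Self { root: root.into(), cache: HashMap::new() }
    }

    /// Returns the rules for `language`, reading the directory only once.
    pub fn load(&mut self, language: &str) -> io::Result<&RuleSet> {
        if !self.cache.contains_key(language) {
            let rules = read_rule_dir(&self.root.join(language))?;
            self.cache.insert(language.to_string(), rules);
        }
        Ok(&self.cache[language])
    }

    pub fn is_cached(&self, language: &str) -> bool {
        self.cache.contains_key(language)
    }
}

/// Reads every `*.yaml` file in `dir`; other files are ignored.
fn read_rule_dir(dir: &Path) -> io::Result<RuleSet> {
    let mut rules = RuleSet::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.extension().and_then(|e| e.to_str()) != Some("yaml") {
            continue;
        }
        if let Some(stem) = path.file_stem().and_then(|s| s.to_str()) {
            rules.insert(stem.to_string(), fs::read_to_string(&path)?);
        }
    }
    Ok(rules)
}
```

A real implementation would likely store compiled rules in the cache instead of strings, so that the YAML is parsed once per process rather than once per file analyzed.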

AST Processing Pipeline:

  1. Parse: Use tree-sitter to generate a language-specific syntax tree
  2. Match: Apply ast-grep rules to identify entities
  3. Extract: Convert matched nodes to structured data
  4. Resolve: Compute FQNs using scope tracking
  5. Output: Return a standardized ParseResult
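To show how the five stages could hand data to each other, here is a toy end-to-end sketch. The parse and match stages are replaced by a naive line scanner, which is an assumption made purely to keep the example self-contained; the real library would use tree-sitter and ast-grep there. All function and type names other than `ParseResult` are hypothetical.

```rust
/// Output of steps 1-2: a matched node plus the namespaces enclosing it.
#[derive(Debug)]
pub struct Match {
    pub keyword: String, // "module", "class", or "def"
    pub name: String,
    pub scope: Vec<String>,
}

/// Output of steps 3-4: structured data with a resolved FQN.
#[derive(Debug, PartialEq)]
pub struct Definition {
    pub kind: String,
    pub fqn: String,
}

/// Step 5: the standardized result.
#[derive(Debug, Default)]
pub struct ParseResult {
    pub definitions: Vec<Definition>,
}

/// Steps 1-2 (toy stand-in): scan lines instead of matching syntax nodes.
fn parse_and_match(source: &str) -> Vec<Match> {
    let mut matches = Vec::new();
    // One entry per open block: Some(name) for a namespace, None for a method.
    let mut open_blocks: Vec<Option<String>> = Vec::new();
    for line in source.lines() {
        let mut words = line.split_whitespace();
        match (words.next(), words.next()) {
            (Some(keyword @ ("module" | "class" | "def")), Some(name)) => {
                let scope = open_blocks.iter().flatten().cloned().collect();
                matches.push(Match {
                    keyword: keyword.to_string(),
                    name: name.to_string(),
                    scope,
                });
                open_blocks.push(if keyword == "def" { None } else { Some(name.to_string()) });
            }
            (Some("end"), _) => {
                open_blocks.pop();
            }
            _ => {}
        }
    }
    matches
}

/// Steps 3-4: convert matches to definitions and resolve FQNs from scope.
fn extract_and_resolve(matches: Vec<Match>) -> Vec<Definition> {
    matches
        .into_iter()
        .map(|m| {
            let prefix = m.scope.join("::");
            let fqn = match (prefix.is_empty(), m.keyword.as_str()) {
                (true, _) => m.name.clone(),
                (false, "def") => format!("{}#{}", prefix, m.name),
                (false, _) => format!("{}::{}", prefix, m.name),
            };
            Definition { kind: m.keyword, fqn }
        })
        .collect()
}

/// Runs all stages in order and returns the standardized result.
pub fn parse(source: &str) -> ParseResult {
    let matches = parse_and_match(source);
    ParseResult { definitions: extract_and_resolve(matches) }
}
```

The line scanner ignores `if`/`do` blocks, class methods, and method parameters, so it is only good for demonstrating the data flow. The point of the staging is that each step has a typed input and output and can be tested in isolation.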

Implementation Considerations

Performance:

  • Rule compilation and caching to avoid repeated parsing of YAML files
  • Efficient AST traversal patterns to minimize memory usage
  • Lazy loading of language-specific components

Extensibility:

  • Plugin-like architecture for adding new languages
  • Configurable rule sets (e.g., strict vs. permissive parsing)
  • Hooks for custom post-processing logic
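One way the plugin-like architecture could look is a registry that maps file extensions to language parsers behind a shared trait. The sketch below is an assumption about the shape, not an existing API: `LanguageParser`, `ParserRegistry`, and `RubyStub` are hypothetical names, and the trait is reduced to the two methods needed for lookup.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Minimal contract a language plugin would implement (hypothetical).
pub trait LanguageParser {
    fn language(&self) -> &'static str;
    fn extensions(&self) -> &'static [&'static str];
}

/// Owns the registered parsers and finds one by file extension.
#[derive(Default)]
pub struct ParserRegistry {
    parsers: Vec<Box<dyn LanguageParser>>,
    by_extension: HashMap<&'static str, usize>,
}

impl ParserRegistry {
    /// Adds a parser and indexes every extension it claims.
    pub fn register(&mut self, parser: Box<dyn LanguageParser>) {
        let index = self.parsers.len();
        for &ext in parser.extensions() {
            self.by_extension.insert(ext, index);
        }
        self.parsers.push(parser);
    }

    /// Returns the parser responsible for `path`, if any is registered.
    pub fn for_path(&self, path: &str) -> Option<&dyn LanguageParser> {
        let ext = Path::new(path).extension()?.to_str()?;
        let index = *self.by_extension.get(ext)?;
        Some(self.parsers[index].as_ref())
    }
}

/// Placeholder plugin used only to exercise the registry.
pub struct RubyStub;

impl LanguageParser for RubyStub {
    fn language(&self) -> &'static str {
        "ruby"
    }
    fn extensions(&self) -> &'static [&'static str] {
        &["rb", "rake"]
    }
}
```

With this shape, adding a language means implementing the trait and calling `register`, without touching the lookup code.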

Error Handling:

  • Graceful degradation when encountering malformed code
  • Detailed error reporting with source location information
  • Recovery strategies for partial parsing success
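The error-handling points above could be sketched as an error type that carries a source location, plus a result type that keeps whatever was extracted alongside the errors. This uses only the standard library for self-containment (the issue lists thiserror as a dependency, which would shorten the `Display` code); all names here are hypothetical.

```rust
use std::fmt;

/// Where in the source an error occurred.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SourceLocation {
    pub file: String,
    pub line: usize,
    pub column: usize,
}

/// Errors a parser run can report (hypothetical variants).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ParseError {
    Syntax { location: SourceLocation, message: String },
    UnsupportedLanguage(String),
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::Syntax { location, message } => write!(
                f,
                "{}:{}:{}: {}",
                location.file, location.line, location.column, message
            ),
            ParseError::UnsupportedLanguage(lang) => {
                write!(f, "unsupported language: {}", lang)
            }
        }
    }
}

impl std::error::Error for ParseError {}

/// Keeps successful extractions even when some nodes failed.
#[derive(Debug, Default)]
pub struct PartialParse {
    pub definitions: Vec<String>,
    pub errors: Vec<ParseError>,
}

/// Splits per-node outcomes into extracted names and recorded errors.
pub fn collect<I>(outcomes: I) -> PartialParse
where
    I: IntoIterator<Item = Result<String, ParseError>>,
{
    let mut partial = PartialParse::default();
    for outcome in outcomes {
        match outcome {
            Ok(name) => partial.definitions.push(name),
            Err(error) => partial.errors.push(error),
        }
    }
    partial
}
```

Returning both lists lets callers index the parts of a malformed file that did parse while still surfacing the failures.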

Testing Infrastructure:

  • Test harness for validating rule correctness
  • Fixture management for language-specific test cases
  • Performance benchmarking utilities

File Structure

crates/parser-core/
├── src/
│   ├── lib.rs                # Main library exports
│   ├── parser.rs             # Core Parser trait and implementation
│   ├── rules/
│   │   ├── mod.rs            # Rule loading and management
│   │   ├── loader.rs         # YAML rule file loader
│   │   └── executor.rs       # ast-grep rule execution
│   ├── ast/
│   │   ├── mod.rs            # AST utilities
│   │   ├── traversal.rs      # Common traversal patterns
│   │   └── scope.rs          # Scope tracking for FQN resolution
│   ├── entities/
│   │   ├── mod.rs            # Entity definitions
│   │   ├── definition.rs     # Definition structures
│   │   ├── reference.rs      # Reference structures
│   │   └── import.rs         # Import structures
│   ├── fqn/
│   │   ├── mod.rs            # FQN resolution framework
│   │   └── resolver.rs       # Base FQN resolution logic
│   └── error.rs              # Error types and handling
├── rules/                    # Language-specific rule files
│   ├── ruby/
│   │   ├── definitions.yaml
│   │   ├── references.yaml
│   │   └── imports.yaml
│   └── python/
│       ├── definitions.yaml
│       ├── references.yaml
│       └── imports.yaml
└── tests/
    ├── fixtures/             # Test code samples
    └── integration/          # Integration tests

Acceptance Criteria

  • Core Parser trait and base implementation
  • Rule loading system that can parse YAML rule files
  • Integration with ast-grep for pattern matching
  • Common data structures for definitions, references, and imports
  • Basic FQN resolution framework
  • Error handling and reporting system
  • Comprehensive test suite with fixtures
  • Documentation and usage examples
  • Performance benchmarks for rule loading and execution

Dependencies

  • tree-sitter - For language-specific AST generation
  • ast-grep - For pattern matching and code extraction
  • serde (with a YAML backend such as serde_yaml) - For deserializing YAML rule files
  • thiserror - For structured error handling
  • tracing - For logging and debugging

This foundational work will enable rapid development of language-specific parsers by providing a consistent, well-tested framework that handles the common concerns of parsing, rule management, and entity extraction.