Skip to content

Count variable occurrences #39

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 1, 2025
Merged

Count variable occurrences #39

merged 1 commit into from
Mar 1, 2025

Conversation

dcz-self
Copy link

@dcz-self dcz-self commented Feb 8, 2025

This helps find typos, because logru otherwise doesn't warn about underspecified variables.

This is a quick and simple prototype.

Things that could be done differently:

  • abstracting VarScope into a trait, although it's probably overkill
  • VarScope.name is made public. Not sure what other interface should be used to copy one. ::new(Vec<_>) or ::from_iter? And then ::empty() for creating an empty one.
  • currently it's just a println. tracing::warn would be another "worse is better" solution, but tracing is not imported by logru. Otherwise this could turn into a warning interface, but I didn't want to go into that rabbit hole before getting a general approval.

@dcz-self
Copy link
Author

dcz-self commented Feb 8, 2025

Example of a repl session:

?- :define dupa(B) :- A.
Some variables appear only once in this rule: B, A,
are those typos?
Defined!

This will mark all free variables as suspicious, so the one way to silence this warning is to replace the variable with _.

This introduces the analysis module and a warning in the repl whenever a variable is defined with only one occurrence.
Copy link
Owner

@fatho fatho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like a nice thing to have, thanks for the suggestion! I do however have two concerns with the proposed implementation:

  1. This sounds more like a program analysis (albeit a very simple one), so I don't think it should be the parser's business to figure this out. Instead, this would be well suited as a separate function (fn unique_named_vars(rule: &Rule) -> HashSet<Var>) that takes a rule and returns all unique named variables based on the enclosed scope. Then we also don't need this VarScopeCounted.
  2. Since all of this is library code, it should never write directly to stderr, since we cannot assume that the application actually using it is fine with that. Instead, the warning can then be implemented as part of the repl by calling the unique_named_vars function on rules defined within the REPL and printing an appropriate message directly from the REPL, rather than library code.

@dcz-self dcz-self force-pushed the typos branch 2 times, most recently from 8178cab to 06059f6 Compare February 15, 2025 17:41
@dcz-self
Copy link
Author

That would be done.

@dcz-self dcz-self changed the title RFC: Count variable occurrences Count variable occurrences Feb 15, 2025
Copy link
Owner

@fatho fatho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating this, looks good! I still have a small suggestion below:

src/textual.rs Outdated
Comment on lines 94 to 111
/// Load a set of rules from a string.
pub fn load_str(&mut self, rules: &str) -> Result<(), ParseError> {
pub fn load_str(&mut self, rules: &str) -> Result<Vec<Rule>, ParseError> {
let rules = Parser::new(&mut self.symbols).parse_rules_str(rules)?;
for rule in rules {
self.rules.insert(rule);
for rule in &rules {
self.rules.insert(rule.clone());
}
Ok(())
Ok(rules)
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most use sites of this function don't care about the results, so often we'd be cloning the rules unnecessarily.

One thing I can think of is splitting this like so

    /// Parse a set of rules using the symbols defined in this universe.
    pub fn parse_rules(&mut self, rules: &str) -> Result<Vec<Rule>, ParseError> {
        Parser::new(&mut self.symbols).parse_rules_str(rules)
    }

    /// Insert rules previously parsed using [`Self::parse_rules`].
    pub fn insert_rules(&mut self, rules: Vec<Rule>) {
        for rule in rules {
            self.rules.insert(rule);
        }
    }

    /// Load a set of rules from a string.
    pub fn load_str(&mut self, rules: &str) -> Result<(), ParseError> {
        let rules = self.parse_rules(rules)?;
        self.insert_rules(rules);
        Ok(())
    }

Places that don't care about the rules can continue to use load_str, while in other places, one can first run parse_rules, do any analysis as desired, and then call insert_rules once done.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this a lot because this makes it possible to inject Rule objects from a different Universe (so with mismatched symbols), but I can't think of anything else without cloning.

Done.

@dcz-self dcz-self force-pushed the typos branch 2 times, most recently from ffd0206 to 5e8af4c Compare February 16, 2025 13:06
Copy link
Owner

@fatho fatho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@fatho fatho merged commit aa08c96 into fatho:main Mar 1, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants