Initial setup for concurrent workers #46

Merged
merged 10 commits into from
Dec 11, 2024

Conversation

erbesharat
Member

Type of change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

Description

This PR refactors the split functionality to use Node.js Worker Threads so that split operations run concurrently. Previously, the splitting operations ran sequentially, which could be slow for larger projects with multiple endpoints. Running them in parallel on worker threads should reduce the overall time the split takes.
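
As a rough illustration of the approach described above (not the code in this PR), here is a minimal sketch of running one worker thread per endpoint and waiting for all of them. The names `runSplitWorker`, `splitAll`, and `SplitResult`, and the inlined worker body, are hypothetical stand-ins; only `Worker`, `workerData`, and `parentPort` come from the `node:worker_threads` module.

```typescript
import { Worker } from "node:worker_threads";

// Hypothetical worker body, inlined as a string so the sketch fits in one file.
// A real project would point Worker at its own worker script instead.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  // Stand-in for the real split work on a single endpoint.
  parentPort.postMessage({ endpoint: workerData.endpoint, done: true });
`;

// Hypothetical result shape, assumed for illustration only.
interface SplitResult {
  endpoint: string;
  done: boolean;
}

// Starts one worker for one endpoint and resolves with its first message.
function runSplitWorker(endpoint: string): Promise<SplitResult> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { endpoint },
    });
    worker.once("message", resolve);
    worker.once("error", reject);
    worker.once("exit", (code) => {
      // A no-op if the promise already resolved from the message.
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}

// Runs all endpoints in parallel; results keep the input order.
function splitAll(endpoints: string[]): Promise<SplitResult[]> {
  return Promise.all(endpoints.map(runSplitWorker));
}
```

`Promise.all` rejects as soon as any worker fails, so a caller can treat the whole split as failed in that case.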

Related Issue

Issue Number: #41

Notes for Reviewer:

@florianbgt We shouldn't have to worry about thread count for now, since worker_threads seems to handle this well on its own and doesn't appear to cause leaks. We could add an extra layer of checking, but I'd rather do that in a separate PR so it doesn't block this one, which is already a good improvement.
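
If the extra layer of checking ever becomes necessary, one possible shape for it is a cap on how many jobs are in flight at once. The sketch below is an assumption about what that follow-up could look like, not something in this PR; `runWithLimit` and `defaultLimit` are hypothetical names, and each task is assumed to be a function that starts one worker and returns a promise.

```typescript
import { cpus } from "node:os";

// Runs async tasks with at most `limit` of them in flight at any time.
// Results are returned in the same order as the input tasks.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;

  // Each lane pulls the next unclaimed task until none are left.
  async function drain(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const laneCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: laneCount }, () => drain()));
  return results;
}

// Assumed default: leave one core free for the main thread.
const defaultLimit = Math.max(1, cpus().length - 1);
```

A call such as `runWithLimit(tasks, defaultLimit)` would then replace a plain `Promise.all` over all tasks.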

Checklist

  • I have read the contributing guidelines
  • My code follows the code style of this project.
  • I have added tests that prove my feature works
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

.catch((error) => {
  console.error(error);
  throw error;
});
Member

The response is sent before the workers are done with their jobs.
We should wait until the workers are done before sending the response.
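
A minimal sketch of the ordering being asked for, assuming each worker job is exposed as a function returning a promise. `handleSplit` and `ResponseLike` are hypothetical; the project's actual handler and HTTP types are not shown in this thread, and replying with a 500 on failure is an assumption.

```typescript
// Hypothetical minimal response shape, assumed for illustration only.
interface ResponseLike {
  status(code: number): ResponseLike;
  json(body: unknown): void;
}

async function handleSplit(
  jobs: Array<() => Promise<unknown>>,
  res: ResponseLike,
): Promise<void> {
  try {
    // Start every job, then wait for all of them before replying.
    const results = await Promise.all(jobs.map((job) => job()));
    res.status(200).json({ completed: results.length });
  } catch (error) {
    // Any failed job lands here, so the client never gets a premature success.
    console.error(error);
    res.status(500).json({ error: "split failed" });
  }
}
```

The point is that the success response is only reachable after the `await`, so it cannot be sent while jobs are still running.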

Member Author

Updated the code to wait for the workers to finish before sending the response 🟢

import Parser from "tree-sitter";
import assert from "assert";
import { getLanguagePlugin } from "../languagesPlugins";
import { DepExport } from "../languagesPlugins/types";

class SplitRunner {
Member

Let's delete this class; it has become kind of useless now. We can replace it with a simple method.
Maybe in another PR.

Member Author

#48


console.time("Remove errors");
removeErrors();
console.timeEnd("Remove errors");
Member

@florianbgt florianbgt Dec 9, 2024

We need to find a better way to log, maybe with a fancier logger. Since these steps now run in parallel, the logs are very hard to read. Also probably in another PR.
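
One lightweight option, sketched here as an assumption rather than a proposal for a specific library, is to prefix every line with a scope and the thread id so interleaved output can still be attributed. `createLogger` and `Sink` are hypothetical names; `threadId` is exported by `node:worker_threads` and is `0` on the main thread.

```typescript
import { threadId } from "node:worker_threads";

// Where finished log lines go; defaults to console.log.
type Sink = (line: string) => void;

function createLogger(scope: string, sink: Sink = console.log) {
  const prefix = `[${scope} t${threadId}]`;
  return {
    info(message: string): void {
      sink(`${prefix} ${message}`);
    },
    // Runs a step and reports its duration on a single prefixed line,
    // whether the step succeeds or throws.
    async time<T>(label: string, step: () => Promise<T> | T): Promise<T> {
      const start = Date.now();
      try {
        return await step();
      } finally {
        sink(`${prefix} ${label}: ${Date.now() - start}ms`);
      }
    },
  };
}
```

With a logger created per worker, the timing above could be written as `await log.time("Remove errors", () => removeErrors())`, which emits one attributable line instead of a separate start and end.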

Member Author

#47

@florianbgt florianbgt merged commit 4e01be7 into main Dec 11, 2024
3 checks passed