Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track the actor that triggers the minion task #14829

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

shounakmk219
Copy link
Collaborator

Description

Minion task can be triggered by 3 ways as of now

  1. Cron schedule
  2. Manual schedule
  3. Adhoc task execution

To identify which actor triggered a particular task run, this PR injects the information in task configs so that it gets tracked in zk for reference.

The triggeredBy info is also made available on TaskDebugInfo and SubtaskDebugInfo.

Refactor on PinotTaskManager

There were too many derivations of schedule methods in PinotTaskManager based on what combination of tables and task types are supposed to be scheduled. This made the changes on schedule inputs hard to manage.
This PR adds a wrapper class TaskSchedulingContext that will be passed to only one scheduleTasks() method to achieve all the derivations.

@shounakmk219 shounakmk219 changed the title Track the actor that triggers the task Track the actor that triggers the minion task Jan 16, 2025
@codecov-commenter
Copy link

codecov-commenter commented Jan 16, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 56.24%. Comparing base (59551e4) to head (7040e3b).
Report is 1648 commits behind head on master.

❗ There is a different number of reports uploaded between BASE (59551e4) and HEAD (7040e3b). Click for more details.

HEAD has 44 uploads less than BASE
Flag BASE (59551e4) HEAD (7040e3b)
integration 7 0
integration2 3 0
temurin 12 3
java-21 7 2
skip-bytebuffers-true 3 1
skip-bytebuffers-false 7 2
unittests 5 3
java-11 5 1
unittests2 3 0
integration1 2 0
custom-integration1 2 0
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #14829      +/-   ##
============================================
- Coverage     61.75%   56.24%   -5.52%     
- Complexity      207      796     +589     
============================================
  Files          2436     2120     -316     
  Lines        133233   114478   -18755     
  Branches      20636    18410    -2226     
============================================
- Hits          82274    64384   -17890     
+ Misses        44911    44862      -49     
+ Partials       6048     5232     -816     
Flag Coverage Δ
custom-integration1 ?
integration ?
integration1 ?
integration2 ?
java-11 56.17% <ø> (-5.54%) ⬇️
java-21 56.10% <ø> (-5.53%) ⬇️
skip-bytebuffers-false 56.22% <ø> (-5.53%) ⬇️
skip-bytebuffers-true 56.05% <ø> (+28.32%) ⬆️
temurin 56.24% <ø> (-5.52%) ⬇️
unittests 56.24% <ø> (-5.51%) ⬇️
unittests1 56.24% <ø> (+9.35%) ⬆️
unittests2 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

* It might be called from the non-leader controller.
* Returns a map from the task type to the {@link TaskSchedulingInfo} of tasks scheduled.
*/
public synchronized Map<String, TaskSchedulingInfo> scheduleAllTasksForAllTables(@Nullable String minionInstanceTag) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest rather than removing them add a @deprecated annotation on top, along with the method that they should be using. We can remove this in next release

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, will revert removed methods

@@ -114,6 +114,10 @@ public class PinotTaskManager extends ControllerPeriodicTask<Void> {

private final TaskManagerStatusCache<TaskGeneratorMostRecentRunInfo> _taskManagerStatusCache;

public enum Triggers {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to move this outside

Map<String, List<TableConfig>> enabledTableConfigMap = new HashMap<>();
for (String tableNameWithType : tableNamesWithType) {
Map<String, Set<String>> tableToTasksMap = context.getTableToTaskNamesMap();
if (context.getTableToTaskNamesMap().isEmpty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can simply use tableToTasksMap here

if (context.getTableToTaskNamesMap().isEmpty()) {
_pinotHelixResourceManager.getAllTables().forEach(table -> tableToTasksMap.put(table, null));
}
for (Map.Entry<String, Set<String>> entry : context.getTableToTaskNamesMap().entrySet()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as before

/**
* Helper method to schedule tasks (all task types) for the given tables that have the tasks enabled.
* Returns a map from the task type to the {@link TaskSchedulingInfo} of the tasks scheduled.
*/
protected synchronized Map<String, TaskSchedulingInfo> scheduleTasks(List<String> tableNamesWithType,
boolean isLeader, @Nullable String minionInstanceTag) {
public synchronized Map<String, TaskSchedulingInfo> scheduleTasks(TaskSchedulingContext context) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this to be public? let's keep it as protected

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is being called everywhere hence kept it public

for (String tableNameWithType : tableNamesWithType) {
Map<String, Set<String>> tableToTasksMap = context.getTableToTaskNamesMap();
if (context.getTableToTaskNamesMap().isEmpty()) {
_pinotHelixResourceManager.getAllTables().forEach(table -> tableToTasksMap.put(table, null));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding an entry for all tables is probably not optimal, see if there is a way around this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, let me refactor the code to avoid adding entries

@KKcorps
Copy link
Contributor

KKcorps commented Jan 30, 2025

Add missing license headers

@shounakmk219 shounakmk219 force-pushed the task-trigger-tracking branch from 79a8879 to 7040e3b Compare January 30, 2025 14:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants