pipeline refactor and cluster watch #731

Merged
erda-bot merged 18 commits into erda-project:master from feature/pipeline-clusterinfo
Jul 5, 2021

chengjoey
Member

What type of this PR

Add one of the following kinds:
/kind feature

What this PR does / why we need it:

Refactor pipeline: add three executors to the scheduler plugin (k8sjob, k8sflink, k8sspark); dcos and edas tasks are dispatched to the scheduler module.
Get cluster info from the cluster-manager module and register a cluster hook.

Which issue(s) this PR fixes:

Specified Reviewers:

/assign @your-reviewer

@chengjoey chengjoey force-pushed the feature/pipeline-clusterinfo branch from 1fb34d4 to 574c817 Compare June 30, 2021 07:07

codecov bot commented Jun 30, 2021

Codecov Report

Merging #731 (ec2b5a1) into master (93784d3) will increase coverage by 0.03%.
The diff coverage is 0.00%.

❗ Current head ec2b5a1 differs from pull request most recent head a58ff1a. Consider uploading reports for the commit a58ff1a to get more accurate results

@@            Coverage Diff             @@
##           master     #731      +/-   ##
==========================================
+ Coverage   10.14%   10.18%   +0.03%     
==========================================
  Files         940      937       -3     
  Lines       88167    87840     -327     
==========================================
  Hits         8946     8946              
+ Misses      78287    77961     -326     
+ Partials      934      933       -1     
| Impacted Files | Coverage Δ |
|---|---|
| ...gine/actionexecutor/plugins/scheduler/scheduler.go | 0.00% <0.00%> (ø) |
| ...dules/pipeline/services/pipelinesvc/clusterinfo.go | 0.00% <0.00%> (ø) |
| modules/pipeline/spec/pipeline_configs.go | 0.00% <ø> (ø) |
| pkg/mock/mock.go | 24.65% <0.00%> (ø) |

@Effet Effet requested a review from sfwn July 1, 2021 01:47
@Effet Effet added pipeline pipeline service refactor labels Jul 1, 2021
)

const (
ClusterTypeDcos = "dcos"
Member

Can you reuse the definitions at lines 20~22 in this file?

@chengjoey chengjoey force-pushed the feature/pipeline-clusterinfo branch from 574c817 to 695a835 Compare July 1, 2021 13:14
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
errStr := fmt.Sprintf("decode clusterhook request fail: %v", err)
logrus.Error(errStr)
return httpserver.HTTPResponse{Status: http.StatusBadRequest, Content: errStr}, nil
Member

use httpserver.ErrResp

@@ -245,5 +245,8 @@ func (e *Endpoints) Routes() []httpserver.Endpoint {
// reports
{Path: "/api/pipeline-reportsets/{pipelineID}", Method: http.MethodGet, Handler: e.queryPipelineReportSet},
{Path: "/api/pipeline-reportsets", Method: http.MethodGet, Handler: e.pagingPipelineReportSets},

// cluster info
{Path: "/clusterhook", Method: http.MethodPost, Handler: e.clusterHook},
Member

A cluster hook for what? You should specify that in the path.

ev := apistructs.CreateHookRequest{
Name: "pipeline_watch_cluster_changed",
Events: []string{bundle.ClusterEvent},
URL: strutil.Concat("http://", discover.Pipeline(), "/clusterhook"),
Member

Use the same const as the endpoints.
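A minimal sketch of what this comment asks for: the route table and the hook registration both read the path from one constant. The constant name and value are taken from a later revision of this PR; the `endpoint` type, `routes`, `hookURL`, and the address `pipeline:3081` are hypothetical stand-ins for `httpserver.Endpoint`, `Routes()`, the URL built with `strutil.Concat`, and `discover.Pipeline()`.

```go
package main

import "fmt"

// ClusterHookApiPath is the single place where the hook route is spelled out.
// The value is the one that appears in a later revision of this PR.
const ClusterHookApiPath = "/api/pipeline-clusters/hook"

// endpoint is a simplified, hypothetical stand-in for httpserver.Endpoint.
type endpoint struct {
	Path   string
	Method string
}

// routes registers the hook handler under the shared constant.
func routes() []endpoint {
	return []endpoint{{Path: ClusterHookApiPath, Method: "POST"}}
}

// hookURL builds the callback URL from the same constant, so the registered
// URL and the served route cannot drift apart.
// addr stands in for the value returned by discover.Pipeline().
func hookURL(addr string) string {
	return "http://" + addr + ClusterHookApiPath
}

func main() {
	fmt.Println(routes()[0].Path)
	fmt.Println(hookURL("pipeline:3081")) // hypothetical address
}
```

With a string literal on each side, renaming the route would silently break the webhook; with the constant, both sides change together.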

Member

How about multiple instances?

func (e *Endpoints) clusterHook(ctx context.Context, r *http.Request, vars map[string]string) (httpserver.Responser, error) {
req := apistructs.ClusterEvent{}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
errStr := fmt.Sprintf("decode clusterhook request fail: %v", err)
Member

failed to xxx

}
}

func (m *Manager) updateExecutor(cluster apistructs.ClusterInfo) error {
Member

This duplicates addExecutor.
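One way to remove the duplication, sketched under assumptions: both entry points delegate to a single "put" helper that creates or replaces the executor for a cluster. The `manager` and `clusterInfo` types, the `putExecutor` name, and the string used as an "executor" are hypothetical; the real Manager in this PR may keep different state.

```go
package main

import (
	"fmt"
	"sync"
)

// clusterInfo is a hypothetical, trimmed-down stand-in for apistructs.ClusterInfo.
type clusterInfo struct {
	Name string
	Type string
}

// manager is a hypothetical stand-in for the task executor Manager.
type manager struct {
	mu        sync.Mutex
	executors map[string]string // cluster name -> executor description
}

func newManager() *manager {
	return &manager{executors: map[string]string{}}
}

// putExecutor builds the executor for a cluster and stores it,
// replacing any executor already registered under the same name.
func (m *manager) putExecutor(c clusterInfo) error {
	if c.Name == "" {
		return fmt.Errorf("failed to put executor, err: empty cluster name")
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	m.executors[c.Name] = c.Type + "-executor"
	return nil
}

// addExecutor and updateExecutor share one implementation.
func (m *manager) addExecutor(c clusterInfo) error    { return m.putExecutor(c) }
func (m *manager) updateExecutor(c clusterInfo) error { return m.putExecutor(c) }

func main() {
	m := newManager()
	_ = m.addExecutor(clusterInfo{Name: "dev", Type: "k8s"})
	_ = m.updateExecutor(clusterInfo{Name: "dev", Type: "edas"})
	fmt.Println(len(m.executors), m.executors["dev"])
}
```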

@chengjoey chengjoey force-pushed the feature/pipeline-clusterinfo branch from 695a835 to 5fba941 Compare July 4, 2021 03:09
@@ -228,6 +228,8 @@ github.com/cheekybits/is v0.0.0-20150225183255-68e9c0620927/go.mod h1:h/aW8ynjgk
github.com/cheggaaa/pb/v3 v3.0.1/go.mod h1:SqqeMF/pMOIu3xgGoxtPYhMNQP258xE4x/XRTYua+KU=
github.com/cheggaaa/pb/v3 v3.0.4 h1:QZEPYOj2ix6d5oEg63fbHmpolrnNiwjUsk+h74Yt4bM=
github.com/cheggaaa/pb/v3 v3.0.4/go.mod h1:7rgWxLrAUcFMkvJuv09+DYi7mMUYi8nO9iOWcvGJPfw=
github.com/chengjoey/flink-on-k8s-operator v0.0.1 h1:a4zrgATUtbQXFrfz0OAH1C9crs4fojzp2k25oNlgfl0=

@sfwn sfwn changed the title Feature/pipeline refactor and cluster watch pipeline refactor and cluster watch Jul 4, 2021
func (e *Endpoints) clusterHook(ctx context.Context, r *http.Request, vars map[string]string) (httpserver.Responser, error) {
req := apistructs.ClusterEvent{}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
errStr := fmt.Sprintf("failed to decode clusterhook request err: %v", err)
Member

failed to decode clusterhook request, err: %v

return httpserver.ErrResp(http.StatusBadRequest, "", errStr)
}
if err := e.pipelineSvc.ClusterHook(req); err != nil {
errStr := fmt.Sprintf("failed to handle cluster event: %v", err)
Member

failed to handle cluster event, err: %v

logrus.Error(errStr)
return httpserver.ErrResp(http.StatusBadRequest, "", errStr)
}
return httpserver.HTTPResponse{Status: http.StatusOK}, nil
Member

httpserver.OkResp
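Putting the review comments on this handler together, a self-contained sketch might look as follows. It does not import the erda packages: `errResp` and `okResp` are hypothetical stand-ins that only mimic the role of `httpserver.ErrResp` and `httpserver.OkResp` (their real signatures are not reproduced here), `clusterEvent` is a trimmed-down stand-in for `apistructs.ClusterEvent`, and the handler takes the body as a string to keep the example small.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// clusterEvent is a hypothetical, trimmed-down stand-in for apistructs.ClusterEvent.
type clusterEvent struct {
	Content struct {
		Type string `json:"type"`
	} `json:"content"`
}

// response is a hypothetical stand-in for httpserver.Responser.
type response struct {
	Status int
	Msg    string
}

// errResp and okResp play the role of httpserver.ErrResp / httpserver.OkResp.
// Their signatures here are assumptions made for this sketch.
func errResp(status int, msg string) response { return response{Status: status, Msg: msg} }

func okResp() response { return response{Status: http.StatusOK} }

// clusterHook decodes the event and hands it to handle, using the
// "failed to <do something>, err: %v" message shape suggested in the review.
func clusterHook(body string, handle func(clusterEvent) error) response {
	var req clusterEvent
	if err := json.NewDecoder(strings.NewReader(body)).Decode(&req); err != nil {
		return errResp(http.StatusBadRequest,
			fmt.Sprintf("failed to decode clusterhook request, err: %v", err))
	}
	if err := handle(req); err != nil {
		return errResp(http.StatusBadRequest,
			fmt.Sprintf("failed to handle cluster event, err: %v", err))
	}
	return okResp()
}

func main() {
	accept := func(clusterEvent) error { return nil }
	fmt.Println(clusterHook(`{"content":{"type":"k8s"}}`, accept).Status)
	fmt.Println(clusterHook(`{bad`, accept).Msg)
}
```

Routing every exit through the two helpers keeps the response shape uniform across endpoints, which appears to be the point of the review comments.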

@@ -37,6 +37,10 @@ import (
"github.com/erda-project/erda/pkg/http/httpserver"
)

const (
ClusterHookApiPath = "/api/pipeline-clusters/hook"
Member

/api/{resource}/actions/{concreteAction}

@@ -194,6 +201,9 @@ func do() (*httpserver.Server, error) {
return nil, err
}

// register cluster hook
registerClusterHook(bdl)
Member

Would it be better to put registerClusterHook into pkg clusterinfo?
At least put these two methods together.

Member

Or have only one method here, such as doClusterAbout.

Member Author

Scheduler initialization needs cluster info, so clusterinfo.Initialize must run first.
The cluster hook should be registered after the pipeline service starts.

Member

Move registerClusterHook into clusterinfo.Initialize

},
}
if err := bdl.CreateWebhook(ev); err != nil {
logrus.Warnf("failed to register watch cluster changed event, %v", err)
Member

Return the err and handle it on the caller side.

Member

And I think pipeline should panic if registerClusterHook fails at bootstrap.
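The two comments above can be sketched as follows: the registration function returns a wrapped error instead of logging a warning, and the caller decides that a failed registration aborts startup. The `create` callback stands in for `bdl.CreateWebhook`; `createHookRequest`, `bootstrap`, `errUnavailable`, the event name `"cluster"`, and the URL are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// createHookRequest is a hypothetical, trimmed-down stand-in for apistructs.CreateHookRequest.
type createHookRequest struct {
	Name   string
	Events []string
	URL    string
}

// errUnavailable is a hypothetical failure used to exercise the error path.
var errUnavailable = errors.New("eventbox unavailable")

// registerClusterHook returns the failure to its caller instead of only logging it.
// create stands in for bdl.CreateWebhook.
func registerClusterHook(create func(createHookRequest) error, url string) error {
	ev := createHookRequest{
		Name:   "pipeline_watch_cluster_changed",
		Events: []string{"cluster"}, // assumed value of bundle.ClusterEvent
		URL:    url,
	}
	if err := create(ev); err != nil {
		return fmt.Errorf("failed to register watch cluster changed event, err: %w", err)
	}
	return nil
}

// bootstrap is a hypothetical caller: it refuses to start when registration fails.
func bootstrap(create func(createHookRequest) error) error {
	if err := registerClusterHook(create, "http://pipeline:3081/clusterhook"); err != nil {
		return fmt.Errorf("failed to bootstrap pipeline, err: %w", err)
	}
	return nil
}

func main() {
	err := bootstrap(func(createHookRequest) error { return errUnavailable })
	// The cause stays inspectable through both layers of wrapping.
	fmt.Println(err)
	fmt.Println(errors.Is(err, errUnavailable))
}
```

Whether the caller panics or returns the error from `do()` is a policy choice; the sketch only shows that the decision moves to the caller.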

Member

The pod IP is different each time.
What about old pod IPs that are already registered? Eventbox may keep sending messages to them, which will of course fail, and the list of subscribers (pod IPs) will keep growing.

Member

If so, maybe we should keep registering with the service IP and have all instances listen from etcd.

}
go m.listenClusterEventSync(context.Background(), eventChan)

logrus.Info("pipengine task executor manager Initialize Done .")
Member

pipeline scheduler task executor

Member

Check all others.

)

const (
DiceRootDomainKEY = "DICE_ROOT_DOMAIN"
Member

Key

package k8sjob

var (
errPullImage = "拉取镜像失败" // "failed to pull image"
Member

Later with i18n

"context"
"fmt"

"github.com/gogap/errors"
Member

Check all error packages; use fmt.Errorf or standardize on github.com/pkg/errors.
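For reference, a small sketch of the standard-library option: `fmt.Errorf` with the `%w` verb (Go 1.13+) wraps a cause, and `errors.Is` finds it again. The sentinel reuses the `errPullImage` name from the k8sjob package, but turning it into an error value is an assumption, as are `startJob` and `isPullImageFailure`.

```go
package main

import (
	"errors"
	"fmt"
)

// errPullImage is a sentinel error. The name mirrors the k8sjob variable,
// but making it an error value (not a string) is an assumption of this sketch.
var errPullImage = errors.New("failed to pull image")

// startJob is a hypothetical function that fails while pulling an image
// and wraps the cause with context using the standard library only.
func startJob(image string) error {
	return fmt.Errorf("failed to start job with image %q, err: %w", image, errPullImage)
}

// isPullImageFailure reports whether err wraps errPullImage.
func isPullImageFailure(err error) bool {
	return errors.Is(err, errPullImage)
}

func main() {
	err := startJob("busybox")
	fmt.Println(err)
	fmt.Println(isPullImageFailure(err))
}
```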

@@ -0,0 +1,45 @@
// Copyright (c) 2021 Terminus, Inc.
Member

Typo of file name schedule.

if data == nil {
return nil
}
jobID, ok := data.(string)
Member

You should check the task executor type here.
Not all string data is a job ID.

Member Author

Only dcos returns a job ID, so I removed injectJobID.
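For context, the reviewer's point can be illustrated with a sketch that checks the executor kind before trusting a type assertion. The function was removed from the PR, so this is purely illustrative: `executorKind`, its constants, and `jobIDFromStartResult` are hypothetical names, and treating dcos as the only kind that returns a job ID follows the author's reply above.

```go
package main

import "fmt"

// executorKind is a hypothetical identifier for the task executor type.
type executorKind string

const (
	kindDCOS   executorKind = "dcos"
	kindK8sJob executorKind = "k8sjob"
)

// jobIDFromStartResult extracts a job ID only for executors known to return one.
// A bare data.(string) assertion would accept any string, whatever it means.
func jobIDFromStartResult(kind executorKind, data interface{}) (string, error) {
	if data == nil {
		return "", nil
	}
	if kind != kindDCOS {
		// Other executors may return strings that are not job IDs; ignore them.
		return "", nil
	}
	jobID, ok := data.(string)
	if !ok {
		return "", fmt.Errorf("failed to read job id, err: unexpected type %T", data)
	}
	return jobID, nil
}

func main() {
	id, err := jobIDFromStartResult(kindDCOS, "job-1")
	fmt.Println(id, err)
	id, err = jobIDFromStartResult(kindK8sJob, "not-a-job-id")
	fmt.Println(id == "", err)
}
```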

@@ -159,6 +164,19 @@ func (s *start) TuneTriggers() taskrun.TaskOpTuneTriggers {
}
}

// injectJobID save flink, spark job id after start
func injectJobID(tr *taskrun.TaskRun, data interface{}) error {
Member

injectJobIDForFlinkSpark is better.

Member Author

Only dcos returns a string job ID; injectJobID has already been removed.

@@ -18,9 +18,11 @@ import (
"strings"
"time"

"github.com/gogap/errors"
Member

the same.


// ClusterHook listen and dispatch cluster event from eventbox
func (s *PipelineSvc) ClusterHook(clusterEvent apistructs.ClusterEvent) error {
if !strutil.Equal(clusterEvent.Content.Type, apistructs.K8S, true) &&
Member

A switch statement would be clearer here.
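A sketch of the suggested shape, assuming the chained `!strutil.Equal(...) &&` condition is a case-insensitive check against a fixed set of cluster types. The condition is truncated in the diff above, so the accepted set used here (`"k8s"` and `"edas"`) and the function name are assumptions, not the PR's actual list.

```go
package main

import (
	"fmt"
	"strings"
)

// supportsClusterType reports whether a cluster event of the given type
// should be handled. The accepted types are assumptions made for this sketch.
func supportsClusterType(clusterType string) bool {
	// Lowercasing once replaces the repeated case-insensitive comparisons.
	switch strings.ToLower(clusterType) {
	case "k8s", "edas":
		return true
	default:
		return false
	}
}

func main() {
	for _, t := range []string{"K8S", "edas", "dcos"} {
		fmt.Println(t, supportsClusterType(t))
	}
}
```

Adding a cluster type then means adding one case label instead of extending a negated boolean chain.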

@@ -95,6 +95,8 @@ type PipelineTaskExtra struct {
LoopOptions *apistructs.PipelineTaskLoopOptions `json:"loopOptions,omitempty"` // guaranteed non-empty once execution starts

AppliedResources apistructs.PipelineAppliedResources `json:"appliedResources,omitempty"`

JobID string `json:"jobID,omitempty"` // flink, spark job id
Member

Use a more concrete field name.

@chengjoey chengjoey force-pushed the feature/pipeline-clusterinfo branch from 5fba941 to 4c14eb3 Compare July 4, 2021 16:33
@chengjoey chengjoey force-pushed the feature/pipeline-clusterinfo branch from 4c14eb3 to d3b5121 Compare July 5, 2021 01:40
@chengjoey chengjoey force-pushed the feature/pipeline-clusterinfo branch from d3b5121 to 9ce8942 Compare July 5, 2021 02:00
@chengjoey chengjoey force-pushed the feature/pipeline-clusterinfo branch from 9ce8942 to 39047d4 Compare July 5, 2021 03:25
Member

sfwn commented Jul 5, 2021

/approve

@erda-bot erda-bot merged commit 5860529 into erda-project:master Jul 5, 2021
@chengjoey chengjoey deleted the feature/pipeline-clusterinfo branch July 16, 2021 01:43