
Redesign airlock to reduce number of storage accounts used #4358

Open
jonnyry opened this issue Feb 11, 2025 · 4 comments

@jonnyry
Collaborator

jonnyry commented Feb 11, 2025

The airlock uses a large number of storage accounts. The number of accounts (particularly the per-workspace ones) limits scalability and also increases cost.

Is it possible to consolidate some of these accounts, and use containers to segregate data instead?


Per account:

  • Private endpoint $7.30/account/month
  • Defender scanning $10/account/month

E.g. TRE with 10 workspaces = 6 core airlock accounts, 50 workspace airlock accounts
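
The arithmetic behind this example can be sketched roughly as follows. The two prices are the per-account figures listed above; the function name is illustrative, and the sketch assumes every airlock account incurs both charges exactly once, ignoring storage and transaction costs.

```python
# Rough fixed monthly overhead of airlock storage accounts.
# Assumption: each account has one private endpoint and Defender scanning.
PRIVATE_ENDPOINT_USD = 7.30   # per account per month (figure quoted above)
DEFENDER_SCAN_USD = 10.00     # per account per month (figure quoted above)

CORE_ACCOUNTS = 6             # core airlock accounts per TRE
ACCOUNTS_PER_WORKSPACE = 5    # airlock accounts created for each workspace


def airlock_overhead(workspaces: int) -> tuple[int, float]:
    """Return (number of accounts, fixed monthly cost in USD)."""
    accounts = CORE_ACCOUNTS + ACCOUNTS_PER_WORKSPACE * workspaces
    cost = accounts * (PRIVATE_ENDPOINT_USD + DEFENDER_SCAN_USD)
    return accounts, round(cost, 2)


accounts, cost = airlock_overhead(10)
print(f"{accounts} accounts, ${cost:.2f}/month")  # 56 accounts, $968.80/month
```

Under these assumptions, each additional workspace adds five accounts, or about $86.50 per month in fixed charges.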

Airlock storage accounts - core

| Name | Description |
| --- | --- |
| st + airlockp + <TRE_ID> | Airlock Processor |
| st + alexapp + <TRE_ID> | Airlock Export Approved |
| st + alimblocked + <TRE_ID> | Airlock Import Blocked |
| st + alimex + <TRE_ID> | Airlock Import External |
| st + alimip + <TRE_ID> | Airlock Import In Progress |
| st + alimrej + <TRE_ID> | Airlock Import Rejected |

Airlock storage accounts - per workspace

| Name | Description |
| --- | --- |
| st + alexblocked + ws + <WS_ID> | Airlock Export Blocked |
| st + alexint + ws + <WS_ID> | Airlock Export Internal |
| st + alexip + ws + <WS_ID> | Airlock Export In Progress |
| st + alexrej + ws + <WS_ID> | Airlock Export Rejected |
| st + alimapp + ws + <WS_ID> | Airlock Import Approved |
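
The naming pattern in the two tables can be sketched as below. The prefixes are copied from the tables; the helper names and the sample IDs (`mytre`, `ab12`) are hypothetical, and an actual deployment may shorten or transform the IDs differently. The validity check reflects the Azure rule that storage account names are 3 to 24 lowercase letters and digits.

```python
import re

# Prefixes taken from the tables above.
CORE_PREFIXES = ["airlockp", "alexapp", "alimblocked", "alimex", "alimip", "alimrej"]
WORKSPACE_PREFIXES = ["alexblocked", "alexint", "alexip", "alexrej", "alimapp"]

# Azure storage account names: 3-24 characters, lowercase letters and digits.
_VALID_NAME = re.compile(r"[a-z0-9]{3,24}")


def core_account_names(tre_id: str) -> list[str]:
    """Names of the core airlock accounts for one TRE (illustrative helper)."""
    return [f"st{prefix}{tre_id}" for prefix in CORE_PREFIXES]


def workspace_account_names(ws_id: str) -> list[str]:
    """Names of the airlock accounts for one workspace (illustrative helper)."""
    return [f"st{prefix}ws{ws_id}" for prefix in WORKSPACE_PREFIXES]


def is_valid_account_name(name: str) -> bool:
    return _VALID_NAME.fullmatch(name) is not None


# Hypothetical IDs, for illustration only.
for name in core_account_names("mytre") + workspace_account_names("ab12"):
    print(name, is_valid_account_name(name))
```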
@TonyWildish-BH
Contributor

It's not clear to me that the rej SAs serve any purpose at all. If a file is rejected, why is there any need to keep it at all? What happens to it after it goes to the rej SA, does it just sit there gathering dust until the workspace is destroyed? Can't we just get rid of that one altogether?

@jonnyry
Collaborator Author

jonnyry commented Feb 13, 2025

Indeed - or, if it is necessary to keep it, could there be a rejected container rather than a whole separate storage account?

In fact, could some of the accounts that share the same networking be consolidated, with containers used instead of separate accounts... e.g.

I believe RBAC can be applied at the container level.

Image
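
On the RBAC point, an Azure role assignment can take a single blob container as its scope. A minimal sketch of building such a scope string is below; the subscription ID, resource group, account name, and container name are all placeholders, and the consolidated account shown is hypothetical rather than part of the current design.

```python
def container_scope(subscription_id: str, resource_group: str,
                    account: str, container: str) -> str:
    """Build the Azure resource ID of a blob container, usable as an RBAC scope."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
        f"/blobServices/default/containers/{container}"
    )


# Placeholder values: one hypothetical consolidated account, one container per stage.
scope = container_scope(
    subscription_id="00000000-0000-0000-0000-000000000000",
    resource_group="rg-example",
    account="stexampleairlock",
    container="rejected",
)
print(scope)
```

A data-plane role such as Storage Blob Data Reader assigned at this scope would apply to that container only, not to the other containers in the account.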

@marrobi
Member

marrobi commented Feb 13, 2025

The rejected account was there in case of false positives: occasionally there might be a need to go and retrieve the data.

I'm trying to recall why they were separate accounts. I agree that if they share the networking, and RBAC can be done at the container level, that might work.

@LizaShak (or even @eladiw), can you remember?

@West-P

West-P commented Feb 13, 2025

> I believe RBAC can be applied at the container level.

To do RBAC at the container level, I believe the storage accounts need hierarchical namespace enabled, i.e. they need to be Data Lake Storage Gen2 accounts.


4 participants