The sto-worker service fails to start due to concurrent S3 bucket creation during startup #8043

@giancarloromeo

Is there an existing issue for this?

  • I have searched the existing issues

Which deploy/s?

Observed in Staging but potentially everywhere.

Current Behavior

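@matusdrobuliak66 reported that the storage worker (sto-worker) failed to start after a staging release. The service was reverted to the previous version. The log under "Anything else?" shows the CreateBucket call at startup failing with OperationAborted ("A conflicting conditional operation is currently in progress against this resource"), which is consistent with several instances trying to create the same S3 bucket at the same time.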

Expected Behavior

No response

Steps To Reproduce

No response

Anything else?

log_level=WARNING | log_timestamp=2025-07-04 08:06:58,431 | log_source=servicelib.fastapi.cancellation_middleware:__init__(50) | log_uid=None | log_oec=None| log_trace_id=0 | log_span_id=0 | log_resource.service.name= | log_trace_sampled=False] | log_msg=CancellationMiddleware is in use, in case of client disconection, FastAPI BackgroundTasks will be cancelled too!

Exception in thread app_server_init:
Traceback (most recent call last):
  File "/home/scu/.venv/lib/python3.11/site-packages/aws_library/s3/_error_handler.py", line 76, in wrapper
    return await func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scu/.venv/lib/python3.11/site-packages/aws_library/s3/_client.py", line 148, in create_bucket
    await self._client.create_bucket(**create_bucket_config)
  File "/home/scu/.venv/lib/python3.11/site-packages/aiobotocore/client.py", line 412, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (OperationAborted) when calling the CreateBucket operation: A conflicting conditional operation is currently in progress against this resource. Please try again.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/home/scu/.venv/lib/python3.11/site-packages/celery_library/signals.py", line 43, in _init
    loop.run_until_complete(app_server.lifespan(startup_complete_event))
  File "/usr/local/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/scu/.venv/lib/python3.11/site-packages/servicelib/fastapi/celery/app_server.py", line 22, in lifespan
    async with LifespanManager(
  File "/home/scu/.venv/lib/python3.11/site-packages/asgi_lifespan/_manager.py", line 102, in __aenter__
    await self._exit_stack.aclose()
  File "/usr/local/lib/python3.11/contextlib.py", line 687, in aclose
    await self.__aexit__(None, None, None)
  File "/usr/local/lib/python3.11/contextlib.py", line 745, in __aexit__
    raise exc_details[1]
  File "/usr/local/lib/python3.11/contextlib.py", line 728, in __aexit__
    cb_suppress = await cb(*exc_details)
                  ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scu/.venv/lib/python3.11/site-packages/asgi_lifespan/_concurrency/asyncio.py", line 80, in __aexit__
    await self.task
  File "/home/scu/.venv/lib/python3.11/site-packages/asgi_lifespan/_concurrency/asyncio.py", line 63, in run_and_silence_cancelled
    await self.coroutine()
  File "/home/scu/.venv/lib/python3.11/site-packages/asgi_lifespan/_manager.py", line 73, in run_app
    await self.app(scope, self.receive, self.send)
  File "/home/scu/.venv/lib/python3.11/site-packages/asgi_lifespan/_manager.py", line 13, in app_with_state
    await app(scope, receive, send)
  File "/home/scu/.venv/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/home/scu/.venv/lib/python3.11/site-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/scu/.venv/lib/python3.11/site-packages/starlette/middleware/errors.py", line 152, in __call__
    await self.app(scope, receive, send)
  File "/home/scu/.venv/lib/python3.11/site-packages/opentelemetry/instrumentation/asgi/__init__.py", line 681, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scu/.venv/lib/python3.11/site-packages/starlette/middleware/base.py", line 100, in __call__
    await self.app(scope, receive, send)
  File "/home/scu/.venv/lib/python3.11/site-packages/servicelib/fastapi/cancellation_middleware.py", line 57, in __call__
    await self.app(scope, receive, send)
  File "/home/scu/.venv/lib/python3.11/site-packages/starlette/middleware/gzip.py", line 22, in __call__
    await self.app(scope, receive, send)
  File "/home/scu/.venv/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 48, in __call__
    await self.app(scope, receive, send)
  File "/home/scu/.venv/lib/python3.11/site-packages/starlette/routing.py", line 715, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/scu/.venv/lib/python3.11/site-packages/starlette/routing.py", line 724, in app
    await self.lifespan(scope, receive, send)
  File "/home/scu/.venv/lib/python3.11/site-packages/starlette/routing.py", line 693, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/usr/local/lib/python3.11/contextlib.py", line 210, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/scu/.venv/lib/python3.11/site-packages/fastapi/routing.py", line 133, in merged_lifespan
    async with original_context(app) as maybe_original_state:
  File "/usr/local/lib/python3.11/contextlib.py", line 210, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/scu/.venv/lib/python3.11/site-packages/fastapi/routing.py", line 133, in merged_lifespan
    async with original_context(app) as maybe_original_state:
  File "/usr/local/lib/python3.11/contextlib.py", line 210, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/scu/.venv/lib/python3.11/site-packages/fastapi_pagination/api.py", line 414, in lifespan
    async with _original_lifespan_context(app) as maybe_state:
  File "/home/scu/.venv/lib/python3.11/site-packages/starlette/routing.py", line 569, in __aenter__
    await self._router.startup()
  File "/home/scu/.venv/lib/python3.11/site-packages/starlette/routing.py", line 670, in startup
    await handler()
  File "/home/scu/.venv/lib/python3.11/site-packages/simcore_service_storage/modules/s3.py", line 49, in _on_startup
    await client.create_bucket(
  File "/home/scu/.venv/lib/python3.11/site-packages/aws_library/s3/_error_handler.py", line 84, in wrapper
    raise _map_botocore_client_exception(exc, **kwargs) from exc
aws_library.s3._errors.S3AccessError: Unexpected error while accessing S3 backend
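
One possible mitigation, sketched here only as an illustration and not as the fix adopted by the project, is to make the startup bucket creation tolerant of a concurrent creator: retry when S3 answers OperationAborted, and treat BucketAlreadyOwnedByYou as success. The names `ensure_bucket`, `create_bucket`, `attempts` and `base_delay_s` are hypothetical. The sketch assumes the error carries a botocore-style `response["Error"]["Code"]`, either on the exception itself or on its `__cause__` (the traceback above shows S3AccessError being raised `from` the botocore ClientError).

```python
import asyncio
from typing import Awaitable, Callable, Optional

# OperationAborted is the code seen in the traceback above.
RETRYABLE_CODES = {"OperationAborted"}
# Standard S3 code returned when the caller already owns the bucket.
ALREADY_EXISTS_CODES = {"BucketAlreadyOwnedByYou"}


def s3_error_code(exc: Optional[BaseException]) -> Optional[str]:
    """Return the S3 error code from exc or from its __cause__ chain, if any."""
    while exc is not None:
        response = getattr(exc, "response", None)
        if isinstance(response, dict):
            code = response.get("Error", {}).get("Code")
            if code:
                return code
        exc = exc.__cause__
    return None


async def ensure_bucket(
    create_bucket: Callable[[], Awaitable[object]],  # hypothetical: wraps the real call
    *,
    attempts: int = 5,
    base_delay_s: float = 0.5,
) -> None:
    """Create the bucket, tolerating another instance creating it concurrently."""
    for attempt in range(1, attempts + 1):
        try:
            await create_bucket()
            return
        except Exception as exc:
            code = s3_error_code(exc)
            if code in ALREADY_EXISTS_CODES:
                return  # another instance won the race; nothing left to do
            if code not in RETRYABLE_CODES or attempt == attempts:
                raise  # unrelated error, or retries exhausted
            # Exponential backoff before trying again.
            await asyncio.sleep(base_delay_s * 2 ** (attempt - 1))
```

In the startup handler, the existing `client.create_bucket(...)` call would be passed in as a zero-argument coroutine function (for example via a small `async def` wrapper). Whether retrying is preferable to creating the bucket once outside the service startup is a design decision this sketch does not settle.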

Labels

bug: buggy, it does not work as expected
