k8s-dns issue around mirrorbrain-db #361

Open
rgaudin opened this issue Feb 3, 2025 · 3 comments
Labels
bug Something isn't working time-sensitive Must be addressed rapidly

Comments

@rgaudin
Member

rgaudin commented Feb 3, 2025

The cronjob mb-update-hashes can no longer complete. All of its runs fail like this:

Hashing '/var/www/download.kiwix.org/zim/ted/ted_mul_potential_2025-02.zim'... done.
File 'zim/ted/ted_mul_potential_2025-02.zim' not in database. Not on mirrors yet? Will be inserted.
Hashing '/var/www/download.kiwix.org/zim/ted/ted_mul_poverty_2025-02.zim'... done.
File 'zim/ted/ted_mul_poverty_2025-02.zim' not in database. Not on mirrors yet? Will be inserted.
Hashing '/var/www/download.kiwix.org/zim/ted/ted_mul_public-health_2025-02.zim'... done.
File 'zim/ted/ted_mul_public-health_2025-02.zim' not in database. Not on mirrors yet? Will be inserted.
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/mb/hashes.py", line 114, in check_db
    c = conn.mycursor
        ^^^^^^^^^^^^^
AttributeError: 'Conn' object has no attribute 'mycursor'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/sqlobject/postgres/pgconnection.py", line 221, in makeConnection
    conn = self.module.connect(**self.dsn_dict)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/psycopg2/__init__.py", line 122, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.OperationalError: could not translate host name "mirrorbrain-db-service" to address: Name or service not known


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/mb", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/mb/mb.py", line 2054, in main
    r = mirrordoctor.main()
        ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/cmdln.py", line 262, in main
    return self.cmd(args)
           ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/cmdln.py", line 285, in cmd
    retval = self.onecmd(argv)
             ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/cmdln.py", line 423, in onecmd
    return self._dispatch_cmd(handler, argv)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/cmdln.py", line 1124, in _dispatch_cmd
    return handler(argv[0], opts, *args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/mb/mb.py", line 1341, in do_makehashes
    hasheable.check_db(
  File "/usr/local/lib/python3.11/dist-packages/mb/hashes.py", line 116, in check_db
    c = conn.Hash._connection.getConnection().cursor()
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/sqlobject/dbconnection.py", line 351, in getConnection
    conn = self.makeConnection()
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/sqlobject/postgres/pgconnection.py", line 223, in makeConnection
    raise dberrors.OperationalError(
sqlobject.dberrors.OperationalError: could not translate host name "mirrorbrain-db-service" to address: Name or service not known
used connection string 'dbname=mirrorbrain user=mirrorbrain host=mirrorbrain-db-service port=5432'
stream closed EOF for zim/mb-update-hashes-manual-zdf-wjwt2 (mirrorbrain)

The mirrorbrain-db-service name is provided by k8s, and resolving service names is one of its main purposes. Not being able to resolve it is very concerning.
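To check whether the resolution failure is intermittent, one could run a small probe from inside the affected pod. The following is only a sketch: `probe_resolution` is a hypothetical helper (not part of mirrorbrain), the port 5432 is taken from the connection string in the log above, and the `resolver` parameter exists only so the function can be exercised without a cluster.

```python
import socket
import time


def probe_resolution(hostname, attempts=5, delay=1.0, port=5432,
                     resolver=socket.getaddrinfo):
    """Resolve `hostname` several times and record each outcome.

    Returns a list of (ok, detail) tuples: on success, detail is the sorted
    list of resolved addresses; on failure, it is the error message.
    """
    results = []
    for _ in range(attempts):
        try:
            infos = resolver(hostname, port)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] holds the address string.
            addresses = sorted({info[4][0] for info in infos})
            results.append((True, addresses))
        except socket.gaierror as exc:
            # Same error family as "Name or service not known" in the log.
            results.append((False, str(exc)))
        time.sleep(delay)
    return results
```

Calling `probe_resolution("mirrorbrain-db-service")` in the cronjob's pod and printing the result would show whether failures are sporadic or constant over the probing window.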

Since I only saw this for this task, I looked into mirrorbrain-db and could not find any issue. Other services using it (mostly mirrorbrain-web) are not complaining. Because its RAM request was low compared to its usage, I significantly increased it and restarted.
The following run (manually triggered) ran for more than 5 minutes, whereas the others typically fail after about 1 to 1.5 minutes, but it's a single occurrence so it might be random.

@rgaudin rgaudin added the bug and time-sensitive labels Feb 3, 2025
@rgaudin
Member Author

rgaudin commented Feb 3, 2025

The following run succeeded. Let's see how the next one goes (it runs every hour at :10).

@rgaudin
Member Author

rgaudin commented Feb 3, 2025

Hasn't occurred since. Closing for now.

@rgaudin rgaudin closed this as not planned Feb 3, 2025
@rgaudin rgaudin mentioned this issue Feb 3, 2025
21 tasks
@benoit74 benoit74 reopened this Feb 10, 2025
@benoit74
Collaborator

benoit74 commented Feb 10, 2025

Happened again 20h ago (same job, same error), but the next job was OK.
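Since the failure is transient and the next run succeeds, one possible mitigation (not a fix for the underlying DNS problem) would be to retry the connection with a backoff. The sketch below is an assumption about how that could look: `connect_with_retry` is a hypothetical helper, not something mirrorbrain or sqlobject provides, and the default delays are arbitrary.

```python
import time


def connect_with_retry(connect, attempts=4, base_delay=0.5,
                       retry_on=(OSError,), sleep=time.sleep):
    """Call `connect()` until it succeeds, doubling the wait after each failure.

    `connect` is any zero-argument callable that returns a connection.
    `retry_on` lists the exception types considered transient.
    """
    for attempt in range(attempts):
        try:
            return connect()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

With psycopg2, for instance, `connect` could be a lambda wrapping `psycopg2.connect(...)` and `retry_on` could be `(psycopg2.OperationalError,)`. Whether mb exposes a convenient place to hook this in is not something this thread establishes.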

@benoit74 benoit74 mentioned this issue Feb 10, 2025
21 tasks
@rgaudin rgaudin mentioned this issue Feb 17, 2025
21 tasks
@benoit74 benoit74 mentioned this issue Feb 24, 2025
21 tasks
@rgaudin rgaudin mentioned this issue Mar 3, 2025
21 tasks
@benoit74 benoit74 mentioned this issue Mar 10, 2025
21 tasks
@rgaudin rgaudin mentioned this issue Mar 17, 2025
20 tasks
@rgaudin rgaudin mentioned this issue Mar 31, 2025
20 tasks
@benoit74 benoit74 mentioned this issue Apr 7, 2025
20 tasks