@rvrajvyas

📋 Scenario Overview

  • Name: dns-resolution-broken
  • Difficulty: intermediate
  • Learning Path: production-sre
  • Technologies: [linux, dns, networking, troubleshooting, system-administration]

🔧 What's Broken

In this scenario, the system's DNS configuration (/tmp/clouddojo-resolv.conf) contains invalid and unreachable nameservers, causing all domain name lookups to fail.
Applications can connect using IP addresses but cannot resolve any domain names, leading to complete DNS resolution failure across services.

✅ Solution Summary

To fix the issue:

  1. Inspect the DNS configuration using cat /tmp/clouddojo-resolv.conf.
  2. Identify invalid nameservers (e.g., addresses with an out-of-range octet such as 999, or local-only addresses).
  3. Replace them with valid public DNS servers such as:
    • Google DNS → 8.8.8.8, 8.8.4.4
    • Cloudflare DNS → 1.1.1.1, 1.0.0.1
  4. Add at least two valid nameservers for redundancy.
  5. Verify that domain resolution works using nslookup google.com or dig cloudflare.com.

Once fixed, domain names should resolve successfully, restoring application connectivity.
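
The steps above can be sketched in Python as follows. This is a minimal illustration, not the scenario's actual implementation: the function names `is_valid_nameserver` and `repair_conf`, the sample broken config, and the rule "valid means it parses as an IP and is globally routable" are all assumptions made for the example.

```python
import ipaddress

# Public resolvers named in the solution summary.
PUBLIC_DNS = ["8.8.8.8", "8.8.4.4", "1.1.1.1", "1.0.0.1"]


def is_valid_nameserver(address):
    """Assumed rule: the address must parse as an IP and be globally routable."""
    try:
        return ipaddress.ip_address(address).is_global
    except ValueError:
        # Covers malformed entries such as 999.999.999.999.
        return False


def repair_conf(conf_text, fallbacks=PUBLIC_DNS, minimum=2):
    """Drop invalid nameserver lines, then top up to `minimum` valid entries."""
    kept_lines, servers = [], []
    for line in conf_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "nameserver":
            ok = len(parts) >= 2 and is_valid_nameserver(parts[1])
            if ok and parts[1] not in servers:
                servers.append(parts[1])
                kept_lines.append(f"nameserver {parts[1]}")
            continue  # invalid or duplicate nameserver lines are dropped
        kept_lines.append(line)  # comments and other options pass through
    for candidate in fallbacks:
        if len(servers) >= minimum:
            break
        if candidate not in servers:
            servers.append(candidate)
            kept_lines.append(f"nameserver {candidate}")
    return "\n".join(kept_lines) + "\n"


# Hypothetical broken content; the real scenario file may differ.
broken = "# broken config\nnameserver 999.999.999.999\nnameserver 127.0.0.1\n"
fixed = repair_conf(broken)
print(fixed)
```

Treating loopback and private addresses as invalid matches the "local-only" wording of this scenario; on a real host a local stub resolver can be a legitimate nameserver.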

🧪 Testing Checklist

  • Automated tests pass (python tests/test_scenarios.py)
  • Manual testing completed in dojo interface
  • All hints tested and work correctly
  • Completion story appears when solved
  • Reset function returns to broken state

🤖 AI Assistance (if applicable)

AI Tools Used:

  • ChatGPT (version: GPT-5)

Level of AI Assistance:

  • Code generation and structure
  • Story writing and hints
  • Documentation

Human Review Process:

  • Manually reviewed all AI-generated code
  • Verified the scenario end-to-end
  • Ensured educational and progressive hints
  • Confirmed adherence to project coding and narrative standards

📝 Additional Notes

  • Scenario emphasizes real-world DNS troubleshooting skills used by SREs and Network Engineers.
  • Includes multiple realistic hints guiding learners from identification to resolution.
  • Tests include checks for:
    • Valid nameservers
    • Redundancy (at least two configured)
    • No invalid IPs
    • Working resolution simulation
  • Reset function restores the broken configuration cleanly.
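
For illustration, the first three checks listed above could look roughly like the sketch below. It is not the code in `tests/test_scenarios.py`; the names `nameservers`, `is_valid`, and `check_config` are hypothetical, and "invalid" is assumed to mean an address that does not parse or is not globally routable. The resolution simulation check is not covered here.

```python
import ipaddress


def nameservers(conf_text):
    """Return the address from every `nameserver` line, in file order."""
    split_lines = (line.split() for line in conf_text.splitlines())
    return [p[1] for p in split_lines if len(p) >= 2 and p[0] == "nameserver"]


def is_valid(address):
    """Assumed rule: parses as an IP address and is globally routable."""
    try:
        return ipaddress.ip_address(address).is_global
    except ValueError:
        return False


def check_config(conf_text):
    """Evaluate the validity, redundancy, and no-invalid-IP checks."""
    servers = nameservers(conf_text)
    valid = [s for s in servers if is_valid(s)]
    return {
        "has_valid_nameserver": len(valid) >= 1,
        "has_redundancy": len(valid) >= 2,
        "no_invalid_ips": len(valid) == len(servers),
    }


print(check_config("nameserver 8.8.8.8\nnameserver 1.1.1.1\n"))
```

Returning one named boolean per check lets a test report which requirement the learner has not yet met, instead of a single pass or fail.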

By submitting this PR, I confirm that:

  • I have read and followed the CONTRIBUTING.md guidelines
  • My code follows the project's coding standards
  • I have tested my changes thoroughly
  • If using AI assistance, I have reviewed and validated all generated content

@datakaitech
Owner

Hey @rvrajvyas, the conf file in this scenario is broken, but dig and host will not refer to /tmp/clouddojo-resolv.conf when we run them; they use the system resolver configuration.
Either you need to source that file before running these commands (not recommended, because we are not touching the host system outside the /tmp folder), or you can implement this scenario inside a containerized environment.
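
As a rough sketch of one further option (not something proposed in this thread), the nameservers listed in the /tmp file could be queried explicitly with dig's `@server` argument, which leaves the host configuration untouched. The helper name `dig_commands` is hypothetical, and the example only builds the command lines; it does not run them.

```python
def dig_commands(conf_text, domain):
    """Build one `dig @<nameserver> <domain> +short` command per nameserver line."""
    commands = []
    for line in conf_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            # `@server` tells dig to query that server directly instead of
            # the resolvers configured on the host.
            commands.append(["dig", f"@{parts[1]}", domain, "+short"])
    return commands


# Hypothetical file content; in the scenario this would be read from
# /tmp/clouddojo-resolv.conf.
sample = "nameserver 1.1.1.1\nnameserver 8.8.8.8\n"
for cmd in dig_commands(sample, "cloudflare.com"):
    print(" ".join(cmd))
```

Each command list can be passed to `subprocess.run` with a timeout if dig is installed; an unreachable or invalid nameserver would then show up as a failed or timed-out query.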
