urllib.error.ContentTooShortError: <urlopen error retrieval incomplete: got only 114688 out of 151399 bytes> #2

@vkosuri

The test run failed while downloading an image. The VPN likely terminated the socket mid-transfer, so the program may need retry/resume capability.

To handle socket timeouts, retry the download several times:

         for url in img_url_list:
-            # download each image and save to the input dir
+            # download each image and save to the input dir
             img_filename = urlparse(url).path.split('/')[-1]
-            urlretrieve(url, self.input_dir + os.path.sep + img_filename)
+            try:
+                urlretrieve(url, self.input_dir + os.path.sep + img_filename)
+            except socket.timeout:
+                count = 1
+                while count <= 5:
+                    try:
+                        urlretrieve(url, self.input_dir +
+                                    os.path.sep + img_filename)
+                        break
+                    except socket.timeout:
+                        err_info = ('Reloading for %d time' %
+                                    count if count == 1 else
+                                    'Reloading for %d times' % count)
+                        print(err_info)
+                        count += 1
+                if count > 5:
+                    print("downloading picture failed!")
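The patch above retries only on `socket.timeout`, while the failure reported in this issue is a `urllib.error.ContentTooShortError`, which that `except` clause would not catch. A minimal sketch of a retry helper that covers both cases might look like the following. The name `retrieve_with_retry` and its `attempts`, `delay`, and `fetch` parameters are hypothetical and not part of the repository; `fetch` is there only so the helper can be exercised without network access.

```python
import socket
import time
from urllib.error import ContentTooShortError
from urllib.request import urlretrieve


def retrieve_with_retry(url, dest, attempts=5, delay=1.0, fetch=urlretrieve):
    """Hypothetical helper: call fetch(url, dest) up to `attempts` times.

    Retries when the body is truncated (ContentTooShortError) or the
    socket times out. Returns True on success, False if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            fetch(url, dest)
            return True
        except (ContentTooShortError, socket.timeout) as exc:
            print('Attempt %d of %d failed: %s' % (attempt, attempts, exc))
            if attempt < attempts:
                # Brief pause before the next attempt.
                time.sleep(delay)
    return False
```

In `download_images`, the call could then become `retrieve_with_retry(url, self.input_dir + os.path.sep + img_filename)`, with the caller deciding what to do when it returns `False`. This sketch restarts the download from scratch on each attempt; it does not resume a partial file.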

Stacktrace:

(venv) C:\Users\kosuri\Desktop\asyncio\python-concurrency-getting-started>pytest test_thumbnail_maker.py
================================================= test session starts =================================================
platform win32 -- Python 3.9.5, pytest-7.0.1, pluggy-1.0.0
rootdir: C:\Users\kosuri\Desktop\asyncio\python-concurrency-getting-started
collected 1 item

test_thumbnail_maker.py F                                                                                        [100%]

====================================================== FAILURES =======================================================
________________________________________________ test_thumbnail_maker _________________________________________________

    def test_thumbnail_maker():
        tn_maker = ThumbnailMakerService()
>       tn_maker.make_thumbnails(IMG_URLS)

test_thumbnail_maker.py:34:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
thumbnail_maker.py:71: in make_thumbnails
    self.download_images(img_url_list)
thumbnail_maker.py:31: in download_images
    urlretrieve(url, self.input_dir + os.path.sep + img_filename)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

url = 'https://dl.dropboxusercontent.com/s/1xlgrfy861nyhox/pexels-photo-324655.jpeg'
filename = '.\\incoming\\pexels-photo-324655.jpeg', reporthook = None, data = None

    def urlretrieve(url, filename=None, reporthook=None, data=None):
        """
        Retrieve a URL into a temporary location on disk.

        Requires a URL argument. If a filename is passed, it is used as
        the temporary file location. The reporthook argument should be
        a callable that accepts a block number, a read size, and the
        total file size of the URL target. The data argument should be
        valid URL encoded data.

        If a filename is passed and the URL points to a local resource,
        the result is a copy from local file to new file.

        Returns a tuple containing the path to the newly created
        data file as well as the resulting HTTPMessage object.
        """
        url_type, path = _splittype(url)

        with contextlib.closing(urlopen(url, data)) as fp:
            headers = fp.info()

            # Just return the local path and the "headers" for file://
            # URLs. No sense in performing a copy unless requested.
            if url_type == "file" and not filename:
                return os.path.normpath(path), headers

            # Handle temporary file setup.
            if filename:
                tfp = open(filename, 'wb')
            else:
                tfp = tempfile.NamedTemporaryFile(delete=False)
                filename = tfp.name
                _url_tempfiles.append(filename)

            with tfp:
                result = filename, headers
                bs = 1024*8
                size = -1
                read = 0
                blocknum = 0
                if "content-length" in headers:
                    size = int(headers["Content-Length"])

                if reporthook:
                    reporthook(blocknum, bs, size)

                while True:
                    block = fp.read(bs)
                    if not block:
                        break
                    read += len(block)
                    tfp.write(block)
                    blocknum += 1
                    if reporthook:
                        reporthook(blocknum, bs, size)

        if size >= 0 and read < size:
>           raise ContentTooShortError(
                "retrieval incomplete: got only %i out of %i bytes"
                % (read, size), result)
E           urllib.error.ContentTooShortError: <urlopen error retrieval incomplete: got only 114688 out of 151399 bytes>

C:\Program Files\Python39\lib\urllib\request.py:278: ContentTooShortError
=============================================== short test summary info ===============================================
FAILED test_thumbnail_maker.py::test_thumbnail_maker - urllib.error.ContentTooShortError: <urlopen error retrieval in...
============================================ 1 failed in 63.89s (0:01:03) =============================================

(venv) C:\Users\kosuri\Desktop\asyncio\python-concurrency-getting-started>
