dumpgenerator.py --xmlrevisions creates Error:list index out of range on pokewiki.de #430

@GERZAC1002

Full command that was used:

./dumpgenerator.py --xmlrevisions --images --xml --curonly https://pokewiki.de --namespace 0
I previously ran the command without '--namespace 0' with the same result; I only added it to reproduce the error without putting too much stress on the wiki itself.

Expected behaviour:

creating a dump of https://pokewiki.de

Actual behaviour after a few minutes:

Traceback (most recent call last):
 File "./dumpgenerator.py", line 2569, in <module>
   main()
 File "./dumpgenerator.py", line 2561, in main
   createNewDump(config=config, other=other)
 File "./dumpgenerator.py", line 2128, in createNewDump
   generateXMLDump(config=config, titles=titles, session=other['session'])
 File "./dumpgenerator.py", line 741, in generateXMLDump
   for xml in getXMLRevisions(config=config, session=session, start=start):
 File "./dumpgenerator.py", line 877, in getXMLRevisions
   print "        %d more revisions listed, until %s" % (len(revids), revids[-1])
IndexError: list index out of range
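
The last frame indexes `revids[-1]`, which raises `IndexError` whenever the list is empty. A minimal sketch of a guard around that progress message, assuming an empty `revids` simply means the API returned no further revisions for the batch. `describe_batch` is a hypothetical helper, not a function in dumpgenerator.py, and it is written in Python 3 syntax although the script itself uses the Python 2 `print` statement:

```python
def describe_batch(revids):
    """Build the progress message for one batch of revision IDs.

    An empty batch is reported explicitly instead of indexing revids[-1],
    which is what raises IndexError in the traceback above.
    """
    if not revids:
        return "        0 more revisions listed"
    return "        %d more revisions listed, until %s" % (len(revids), revids[-1])


print(describe_batch([]))               # empty batch, no exception
print(describe_batch([101, 102, 103]))  # normal batch
```

This only avoids the crash in the message. Whether an empty batch should end the loop or be retried is a separate decision that this sketch does not make.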

Full log:
dumgenerator.py_xmlrevisions.log

Tail of the output file:

{{Karte Designs/Zeile|typ=Farblos|Damythir-V (Time Gazer 059)|illus=aky CG Works|seltenheit=RR|num=1}}
{{Karte Designs/Zeile|typ=Farblos|Damythir-V (Time Gazer 076)|illus=aky CG Works|seltenheit=SR|num=2}}
&lt;/div&gt;

[[en:Wyrdeer V (Time Gazer 59)]]
[[ja:&#12450;&#12516;&#12471;&#12471;V (S10D)]]</text>
      <sha1>ip8lev6wdaqnyxpyw926h46ktlmtoup</sha1>
    </revision>
  </page> 

Quick 'integrity' check on the output file

 grep "<title>" -c *-current.xml ; grep "<page" -c *-current.xml ; grep "</page>" -c *-20220412-current.xml 
2231
2231
2231

Number of page titles inside *-titles.txt: 86796
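
The same check can be scripted. This is a rough sketch that mirrors `grep -c` by counting the lines that contain each marker. `count_markers` and `dump_looks_complete` are hypothetical names, and the sample text is made up for illustration:

```python
import io


def count_markers(lines):
    """Count lines containing <title>, <page and </page>, like `grep -c`."""
    counts = {"<title>": 0, "<page": 0, "</page>": 0}
    for line in lines:
        for marker in counts:
            if marker in line:
                counts[marker] += 1
    return counts


def dump_looks_complete(counts):
    """True when every opened page has a title and a closing tag."""
    return counts["<title>"] == counts["<page"] == counts["</page>"]


# Made-up sample: the second page is cut off before its closing tag.
sample = io.StringIO(
    "<page>\n<title>A</title>\n</page>\n"
    "<page>\n<title>B</title>\n"
)
counts = count_markers(sample)
print(counts, dump_looks_complete(counts))
```

For a real dump, pass an open file handle instead of the `io.StringIO` sample. Equal counts only show that the pages written so far are well-formed. With 2231 pages in the dump against 86796 lines in the titles file, the dump here still appears to have stopped early.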

Test without '--xmlrevisions'

./dumpgenerator.py --images --xml --curonly https://pokewiki.de --namespace 0
Checking API... https://www.pokewiki.de/api.php
API is OK: https://www.pokewiki.de/api.php
Checking index.php... https://www.pokewiki.de/index.php
index.php is OK
#########################################################################
# Welcome to DumpGenerator 0.4.0-alpha by WikiTeam (GPL v3)                   #
# More info at: https://github.com/WikiTeam/wikiteam                    #
#########################################################################

#########################################################################
# Copyright (C) 2011-2022 WikiTeam developers                           #

# This program is free software: you can redistribute it and/or modify  #
# it under the terms of the GNU General Public License as published by  #
# the Free Software Foundation, either version 3 of the License, or     #
# (at your option) any later version.                                   #
#                                                                       #
# This program is distributed in the hope that it will be useful,       #
# but WITHOUT ANY WARRANTY; without even the implied warranty of        #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the         #
# GNU General Public License for more details.                          #
#                                                                       #
# You should have received a copy of the GNU General Public License     #
# along with this program.  If not, see <http://www.gnu.org/licenses/>. #
#########################################################################

Analysing https://www.pokewiki.de/api.php
Trying generating a new dump into a new directory...
Loading page titles from namespaces = 0
Excluding titles from namespaces = None
1 namespaces found
    Retrieving titles in the namespace 0
    86795 titles retrieved in the namespace 0
Titles saved at... pokewikide-20220412-titles.txt
86795 page titles loaded
https://www.pokewiki.de/api.php
HTTP Error 404.
Not found. Is Special:Export enabled for this wiki?
https://www.pokewiki.de/index.php?action=submit&curonly=1&limit=1&pages=Main_Page&title=Special%3AExport

After taking pull request #280 from 2016 and integrating it into a new version (pull request #429), I managed to get a full dump of the wiki.
