The evaluation kit contains all the queries in the DocExplore dataset and an evaluation program (for Ubuntu 18.04).

This compiled program can be used to evaluate the image retrieval and pattern spotting tasks. Provide the input data that matches the task you want to evaluate. To see all the options, run:

./main -h

How to prepare the input file for evaluation?

The input must be a “txt” file in which each line corresponds to one query.

  1. Image retrieval task: each line has the form query_id: imageId1 imageId2 …
  2. Pattern spotting task: each line has the form query_id: imageId1_x1_y1_x2_y2 imageId2_x1_y1_x2_y2 …, where (x1, y1) and (x2, y2) are the top-left and bottom-right corners of the detected region in that image. We use Python-style indexing for rows (y-axis) and columns (x-axis).
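
To illustrate the row/column convention, here is a minimal Python sketch that cuts a region out of an image stored as a list of rows. The helper name crop_region is hypothetical, and two details are assumptions that the text above does not state: coordinates are 0-based with the origin at the top-left, and the bottom-right corner is treated as exclusive, as in ordinary Python slicing.

```python
def crop_region(image_rows, x1, y1, x2, y2):
    """Return the sub-image spanning rows y1..y2 and columns x1..x2.

    Assumptions (not confirmed by the kit): 0-based coordinates and an
    exclusive bottom-right corner, matching Python slice semantics.
    """
    # Rows are selected with the y coordinates, columns with the x coordinates.
    return [row[x1:x2] for row in image_rows[y1:y2]]


# A tiny 3-row by 4-column "image" whose pixel value encodes (row, column).
image = [[r * 10 + c for c in range(4)] for r in range(3)]
patch = crop_region(image, x1=1, y1=0, x2=3, y2=2)
```

Note that y selects the row and x selects the column, so the order in which the coordinates are written (x first) is the reverse of the order in which they index the image.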

Each line has to be terminated by “\r”. There is no need to give the full image id (e.g. page1002.jpg); just use 1002 instead.
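
The format rules above can be sketched in Python as follows. The function names are hypothetical, and the exact separator after the query id (a colon followed by one space) is an assumption read from the description, so compare the output against the example files in the kit before relying on it.

```python
def format_retrieval_line(query_id, image_ids):
    """Build one image retrieval line: 'query_id: id1 id2 ...' plus '\\r'."""
    return f"{query_id}: " + " ".join(str(i) for i in image_ids) + "\r"


def format_spotting_line(query_id, detections):
    """Build one pattern spotting line.

    Each detection is a tuple (image_id, x1, y1, x2, y2), where (x1, y1) is
    the top-left corner and (x2, y2) the bottom-right corner.
    """
    entries = ["_".join(str(v) for v in det) for det in detections]
    return f"{query_id}: " + " ".join(entries) + "\r"


def write_results(path, lines):
    """Write already-terminated lines to a text file."""
    # newline="" disables newline translation, so the "\r" terminators
    # are written exactly as they appear in the strings.
    with open(path, "w", newline="") as f:
        f.writelines(lines)


# Image ids are given as bare numbers (1002 rather than page1002.jpg).
example_lines = [
    format_retrieval_line(1, [1002, 17, 345]),
    format_spotting_line(1, [(1002, 10, 20, 110, 220)]),
]
```

In practice a results file holds lines for one task only; the two formats appear together in example_lines purely for illustration.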

Please check the two example files included in the evaluation kit as a reference when preparing your solution.

Performance is reported as mean Average Precision (mAP).
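
For readers unfamiliar with the metric, the following Python sketch shows the textbook definition of mean Average Precision for ranked retrieval lists. It is not the kit's implementation: the function names and the ground-truth dictionary are hypothetical, and the kit's exact choices (for example how duplicates are handled, or how a pattern spotting detection is matched to the ground truth) are not stated here.

```python
def average_precision(ranked_ids, relevant_ids):
    """Textbook AP: mean of precision@k over the ranks k where a hit occurs."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    seen = set()
    for rank, image_id in enumerate(ranked_ids, start=1):
        # Count each relevant image once, even if it is returned twice
        # (an assumption of this sketch).
        if image_id in relevant and image_id not in seen:
            hits += 1
            precision_sum += hits / rank
        seen.add(image_id)
    # Relevant images that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(results, ground_truth):
    """Average the AP over every query listed in the ground truth."""
    aps = [
        average_precision(results.get(query_id, []), relevant)
        for query_id, relevant in ground_truth.items()
    ]
    return sum(aps) / len(aps) if aps else 0.0


# Hypothetical data: query 1 has two relevant pages, query 2 has one.
results = {1: [1002, 5, 17], 2: [9]}
ground_truth = {1: [1002, 17], 2: [8]}
score = mean_average_precision(results, ground_truth)
```

Because the score depends on rank, the order of the image ids on each line of the input file matters: the most confident results should come first.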

Notice: The query ids in this evaluation kit may differ from those in the previous version. We had trouble retrieving the query ids when extracting the images from the DocExplore tool. Please use the new query set.

Known issues:

  1. ‘GLIBC_*.**’ not found: upgrade glibc using apt-get install libc6.