I'd like to request that the E621 IQDB data be added to the daily Database Exports. This seems like something that could easily be self-hosted, but to set up IQDB we either need the source images to generate the HAAR signatures (which would involve downloading tons of files) or, ideally, we just download the HAAR signatures from the existing E621 IQDB database and import them into self-hosted IQDB instances.
E621 IQDB Search has upload whitelists, rate limits, and requires CSRF authenticity tokens, all of which impede automated, and especially bulk, reverse image searching. Other third-party reverse image searches also appear to have significant rate limits for bulk searching. This makes sense, since I imagine it takes a fair bit of compute, and I understand why it's rate limited and somewhat locked down.
From what I can see in the IQDB source code, the service stores searchable hashes alongside a corresponding Post ID, with the hashes primarily residing under separate RGB channels. Given that E621 is self-hostable and an export of all posts is already provided, I'd like the IQDB HAAR hashes/signatures to be provided as well, which would permit efficient self-hosted reverse image searches of E621 posts from the database exports.
Option 1: Add IQDB sqlite dump to Database Exports
- Simply dump the SQLite database and serve it with the export, so we could initialize a self-hosted IQDB from it
- Probably best to convert it to CSV instead and publish it as iqdb-[date].csv.gz
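As a rough sketch of what the CSV conversion for Option 1 could look like: the script below reads (post ID, hash) rows out of a SQLite file and writes them to a gzipped CSV. The table name `images`, the column names `post_id` and `hash`, and the hex encoding of the hash are all assumptions on my part for illustration; the real IQDB schema may differ.

```python
import csv
import gzip
import sqlite3


def export_hashes(db_path, out_path, table="images"):
    """Write (post_id, hash) rows from a SQLite table to a gzipped CSV.

    Assumed (hypothetical) schema: a table with an integer `post_id`
    column and a BLOB `hash` column. Returns the number of rows written.
    """
    conn = sqlite3.connect(db_path)
    count = 0
    try:
        rows = conn.execute(
            f'SELECT post_id, hash FROM "{table}" ORDER BY post_id'
        )
        with gzip.open(out_path, "wt", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(["post_id", "hash"])
            for post_id, blob in rows:
                # Hex-encode binary hashes so they survive a text format.
                value = blob.hex() if isinstance(blob, bytes) else blob
                writer.writerow([post_id, value])
                count += 1
    finally:
        conn.close()
    return count
```

Something like `export_hashes("iqdb.db", "iqdb-2024-01-01.csv.gz")` would then produce the file to serve next to the other exports (both file names here are made up).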
Option 2: Add iqdb_data column to the posts.csv of Database Exports
- Reuses the existing CSV and adds a new column with the information needed to sync into a self-hosted IQDB instance and perform reverse searches
- Either encode the data we have for the post in the IQDB, or hit the images/[post_id] endpoint and feed the corresponding .hash into the iqdb_haar_hash column in the CSV. The latter would require additional decoding
My preference is Option 1. It's the simplest, could easily be converted to SQLite INSERTs, and won't mess with the other existing CSV exports. To promote self-hosting, we'd likely want an additional IQDB endpoint to add an image by a known hash, but this is something I could do on my own time.
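To illustrate the "converted to a SQLite INSERT" part, here is a minimal sketch of loading such a CSV into a local SQLite database. It assumes a gzipped CSV with a `post_id,hash` header and hex-encoded hashes, and a target table called `images`; these names and the encoding are hypothetical and not taken from the actual IQDB schema.

```python
import csv
import gzip
import sqlite3


def import_hashes(csv_path, db_path, table="images"):
    """Load a gzipped `post_id,hash` CSV into a SQLite table.

    The CSV layout and table schema are assumptions for illustration.
    Returns the number of rows in the table after the import.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            f'CREATE TABLE IF NOT EXISTS "{table}" '
            "(post_id INTEGER PRIMARY KEY, hash BLOB NOT NULL)"
        )
        with gzip.open(csv_path, "rt", newline="", encoding="utf-8") as fh:
            reader = csv.DictReader(fh)
            # Decode the hex string back into raw bytes for storage.
            rows = (
                (int(r["post_id"]), bytes.fromhex(r["hash"])) for r in reader
            )
            with conn:  # one transaction for the whole bulk insert
                conn.executemany(
                    f'INSERT OR REPLACE INTO "{table}" (post_id, hash) '
                    "VALUES (?, ?)",
                    rows,
                )
        return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    finally:
        conn.close()
```

Using `INSERT OR REPLACE` keyed on the post ID means re-running the import against a newer daily export would update existing rows instead of failing on duplicates.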
Thanks!