Topic: [Feature request] option to search for threshold of wanted tags

Posted under Site Bug Reports & Feature Requests

By that I mean: if you're searching for something like 10 tags at once, probably nothing is going to have them all, but if you put a ~ in front of every tag, the results probably won't include a lot of what you're looking for. There should be a middle ground between a post needing only 1 of the tags and requiring every single tag, like, say, at least 3 of the tags.

The formatting of this could be something like ( tag1 tag2 tag3 tag4 tag5 ):3, but that's just a suggestion of course.

I know this sort of thing is possible on FurAffinity by doing something like ( totodile | croconaw | feraligatr ) ( chikorita | bayleef | meganium ) but I'm not sure how well e621's search can even handle that.

SCTH

Member

Technically possible with the current nested search implementation, but you'd have to lay out a whole lot of combinations of tags and it would quickly hit the limit.

Say you wanted at least two of the tags felid, duo, and mouse; you could search:
~( felid ~duo ~mouse ) ~( duo mouse )
As you raise the threshold beyond two tags or add more tag options, though, the query gets exponentially longer.
Doing actual numeric counting of tags directly isn't really possible with the current search system and would be very hard to implement, but the current system works quite well for stuff like one tag in category A and one tag in category B. Not many people know about it, though. See the basic and advanced tag group sections of the cheat sheet.
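
To put rough numbers on how fast this grows, here is a minimal Python sketch of the naive expansion: one AND-group per combination, all OR'd together with ~. The function names are made up for illustration, and the group syntax is just the one shown in this thread. The hand-factored version above is a bit shorter than what this produces, but the growth is still steep.

```python
from itertools import combinations
from math import comb


def at_least_k_query(tags, k):
    """Naive expansion: OR together one AND-group per k-sized combination."""
    groups = [f"~( {' '.join(combo)} )" for combo in combinations(tags, k)]
    return " ".join(groups)


def tag_slots_used(n, k):
    """Tag tokens the naive expansion spends (duplicates are counted)."""
    return comb(n, k) * k


print(at_least_k_query(["felid", "duo", "mouse"], 2))
# ~( felid duo ) ~( felid mouse ) ~( duo mouse )
print(tag_slots_used(3, 2))   # 6
print(tag_slots_used(10, 3))  # 360
```

So "at least 3 of 10" written out naively would need 120 groups and 360 tag tokens, far past the 40-tag limit mentioned later in the thread.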


Aacafah

Moderator

anthonyhotel said:
By that I mean: if you're searching for something like 10 tags at once, probably nothing is going to have them all, but if you put a ~ in front of every tag, the results probably won't include a lot of what you're looking for. There should be a middle ground between a post needing only 1 of the tags and requiring every single tag, like, say, at least 3 of the tags.

The formatting of this could be something like ( tag1 tag2 tag3 tag4 tag5 ):3, but that's just a suggestion of course.

I discussed a similar proposal here, though this would be far more feasible than that: without calculating relevance scores, it'd be far more performant, & I could make a simple metatag that sets the minimum_should_match parameter for the entirety of that (sub)query. For example, minimum_should_match:3 ~tag1 ~tag2 ~tag3 ~tag4 ~tag5 would achieve the desired result, & minimum_should_match:3 ~tag1 ~tag2 ~( minimum_should_match:3 ~tag3 ~tag4 ~tag5 ) would be equivalent to tag1 tag2 tag3 tag4 tag5.

That said, despite the better prospects, I can't say if it'd be performant enough to be feasible, and I'd say that for most practical cases it's not much more useful than the currently supported solutions proposed above.

I'd also like to have this personally (I know for certain I'd get a lot of use out of it), but I doubt that it's tenable. Per-search performance was also a concern when I added support for grouped searches. Not only did that not take much additional processing, but with some careful optimization, the way I implemented it actually improved performance for searches that don't use any groups, so it decreased the overall load on the server. I doubt that'd happen here.

The site is open source, so anyone who's interested can get the code from our GitHub page & try to implement & profile it themselves, but I don't think we'd have the time to take a stab at this ourselves for a good long while, even with the improved prospects for feasibility. That said, that's the same situation grouped searches were in when I came along, so someone could always come along & give it a shot. It's just unlikely to come from us, especially with how time-consuming & potentially futile testing & profiling would be, though implementation should be fairly straightforward. If anyone wants to take a crack at it, DMail me & I can give you advice & a rough implementation; the rest would be up to you.
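
For anyone curious what the metatag idea boils down to, here is a rough Python sketch, assuming minimum_should_match refers to the parameter of that name on Elasticsearch-style bool queries. The field name "tags" and all helper names are made up for illustration, and the small matches() evaluator only mimics the counting rule; it is not how the site's search actually runs.

```python
def term(tag):
    # "tags" is a hypothetical field name, not taken from the site's code.
    return {"term": {"tags": tag}}


def bool_should(clauses, minimum_should_match):
    """Build an Elasticsearch-style bool query as a plain dict."""
    return {"bool": {"should": clauses,
                     "minimum_should_match": minimum_should_match}}


def matches(query, post_tags):
    """Toy evaluator: a bool query matches when enough should-clauses match."""
    if "term" in query:
        return query["term"]["tags"] in post_tags
    body = query["bool"]
    hits = sum(matches(clause, post_tags) for clause in body["should"])
    return hits >= body["minimum_should_match"]


# minimum_should_match:3 ~tag1 ~tag2 ~tag3 ~tag4 ~tag5
flat = bool_should([term(f"tag{i}") for i in range(1, 6)], 3)

# minimum_should_match:3 ~tag1 ~tag2 ~( minimum_should_match:3 ~tag3 ~tag4 ~tag5 )
nested = bool_should(
    [term("tag1"), term("tag2"),
     bool_should([term("tag3"), term("tag4"), term("tag5")], 3)],
    3,
)

print(matches(flat, {"tag1", "tag3", "tag5"}))                      # True
print(matches(flat, {"tag1", "tag3"}))                              # False
print(matches(nested, {"tag1", "tag2", "tag3", "tag4"}))            # False
print(matches(nested, {"tag1", "tag2", "tag3", "tag4", "tag5"}))    # True
```

The nested query has only three should-clauses at the top level and requires all three, and the inner group in turn requires all three of its tags, which is why it behaves like a plain search for all five.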

after thinking about it for a bit i got this working for the 10 choose 3 example within the current system

~( 1 ~( ( ~2 ~3 ) ( ~4 ~5 ) ) ~( 2 3 ) ~( 4 5 ) ) ~( 2 3 ~4 ~5 ) ~( ~2 ~3 4 5 ) ~( 6 ~( ( ~7 ~8 ) ( ~9 ~X ) ) ~( 7 8 ) ~( 9 X ) ) ~( 7 8 ~9 ~X ) ~( ~7 ~8 9 X ) ~( ~( ( ~1 ~2 ~3 ) ( ~4 ~5 ) ) ~( 1 ~2 ~3 ) ~( 2 3 ) ~( 4 5 ) ~6 ~7 ~8 ~9 ~X ) ~( ~1 ~2 ~3 ~4 ~5 ~( ( ~6 ~7 ~8 ) ( ~9 ~X ) ) ~( 6 ~7 ~8 ) ~( 7 8 ) ~( 9 X ) )

unfortunately the restriction of not being able to search more than 40 tags applies to duplicates as well, so this isn't actually usable, but it's interesting that just using nested groups can get it even this short

scth said:
Technically possible with the current nested search implementation, but you'd have to lay out a whole lot of combinations of tags and it would quickly hit the limit.

Say you wanted at least two of the tags felid, duo, and mouse; you could search:
~( felid ~duo ~mouse ) ~( duo mouse )
As you raise the threshold beyond two tags or add more tag options, though, the query gets exponentially longer.
Doing actual numeric counting of tags directly isn't really possible with the current search system and would be very hard to implement, but the current system works quite well for stuff like one tag in category A and one tag in category B. Not many people know about it, though. See the basic and advanced tag group sections of the cheat sheet.

Wow, that does exist now? I didn't realize it ever got added!

aacafah said:
It's almost reached its one year anniversary. Kinda sad I haven't gotten many other big things done since then, but hopefully I'll be changing that soon.

On FA it's called a quorum search. The FA syntax is "tag1 tag2 tag3 tag4 tag5"/x, where x is the number of tags a submission needs to match in order to be returned. There's also no limit on how many tags they allow in a quorum search.

I think it's a feature of the database they use, but don't quote me on that.
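
As a rough illustration of the quorum rule described above, here is a small Python sketch that parses a query of that shape and checks a set of tags against it. It only follows the description in this post; the function names are made up, and this is not FA's actual implementation.

```python
import re

# Matches the shape "tag1 tag2 tag3"/x described above.
QUORUM_RE = re.compile(r'^"([^"]+)"/(\d+)$')


def parse_quorum(query):
    """Split a quorum query into its list of tags and its threshold."""
    m = QUORUM_RE.match(query.strip())
    if m is None:
        raise ValueError(f"not a quorum query: {query!r}")
    return m.group(1).split(), int(m.group(2))


def quorum_matches(query, post_tags):
    """True when at least `threshold` of the wanted tags are on the post."""
    wanted, threshold = parse_quorum(query)
    return len(set(wanted) & set(post_tags)) >= threshold


query = '"tag1 tag2 tag3 tag4 tag5"/3'
print(quorum_matches(query, {"tag1", "tag4", "tag5", "other"}))  # True
print(quorum_matches(query, {"tag1", "tag4"}))                   # False
```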